Data management guidelines for research studies

These guidelines are designed to help ensure that research data is collected, stored, processed, and shared in a way that upholds integrity, security, and compliance with ethical and legal standards. A well-managed dataset should be accurate, secure, clearly documented, and readily shareable. Following these practices will protect participants, safeguard research integrity, and enhance the reproducibility and impact of your study.

1. Planning and documentation

  • Storage and backups – Identify where data will be stored and how it will be backed up.
  • Roles and responsibilities – Clearly define who in your team is responsible for collecting, handling, storing, and analysing data.
  • Data sharing strategy – Consider how, when, and with whom data will be shared during and after the study. Ensure that appropriate data sharing agreements are in place.
  • Change logs – Maintain a record of all modifications, corrections, or updates to the dataset.

2. Compliance and training

  • Complete training in data protection, security, and information governance.
  • Maintain audit trails showing who accessed the data, when, and what changes were made.
  • Know and follow GDPR and local data protection policies.

3. Data collection

  • Use standardised data collection formats across the team.
  • Validate data at entry and perform regular checks to spot inconsistencies or anomalies.
  • For critical information, use double data entry or verification methods.
  • Ensure participants provide informed consent for data collection and use.
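
The double data entry check above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the field names (participant_id, age, visit_date) and the function compare_entries are hypothetical, and it assumes both entry passes have already been loaded as lists of records.

```python
def compare_entries(first, second, key="participant_id"):
    """Compare two independent entry passes and list every disagreement.

    Returns tuples of (record id, field, value in first pass, value in second pass).
    A field of None means the whole record is absent from one pass.
    """
    first_by_key = {row[key]: row for row in first}
    second_by_key = {row[key]: row for row in second}
    mismatches = []
    for record_id in sorted(first_by_key.keys() | second_by_key.keys()):
        a = first_by_key.get(record_id)
        b = second_by_key.get(record_id)
        if a is None or b is None:
            mismatches.append((
                record_id,
                None,
                "present" if a is not None else "missing",
                "present" if b is not None else "missing",
            ))
            continue
        for field in sorted(a.keys() | b.keys()):
            if a.get(field) != b.get(field):
                mismatches.append((record_id, field, a.get(field), b.get(field)))
    return mismatches


# Hypothetical example records, entered twice by different team members.
entry_a = [
    {"participant_id": "P001", "age": "34", "visit_date": "2024-03-01"},
    {"participant_id": "P002", "age": "51", "visit_date": "2024-03-02"},
]
entry_b = [
    {"participant_id": "P001", "age": "34", "visit_date": "2024-03-01"},
    {"participant_id": "P002", "age": "15", "visit_date": "2024-03-02"},
    {"participant_id": "P003", "age": "47", "visit_date": "2024-03-03"},
]
discrepancies = compare_entries(entry_a, entry_b)
```

Each discrepancy should be resolved against the source document rather than by assuming either pass is correct.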

4. Data storage and security

  • Store files in approved storage areas (eg, NCA shared servers).
  • Do not use personal devices or unapproved storage services.
  • Restrict access to authorised team members only.
  • Use strong, unique passwords and enable multi-factor authentication where available.
  • Perform regular backups and test recovery processes periodically.
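
One way to test a recovery process is to restore a file from backup and confirm that it is byte-for-byte identical to the original. The sketch below compares SHA-256 checksums using only the Python standard library. The file names and the helper functions sha256_of and backup_is_intact are hypothetical, and temporary files stand in for a real dataset and its restored copy.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_is_intact(original, restored):
    """True when the restored copy has the same checksum as the original."""
    return sha256_of(original) == sha256_of(restored)


# Demonstration only: temporary files stand in for the dataset and its restored backup.
with tempfile.TemporaryDirectory() as folder:
    original = Path(folder) / "dataset.csv"
    restored = Path(folder) / "dataset_restored.csv"
    original.write_bytes(b"participant_id,age\nP001,34\n")

    # A faithful restore has an identical checksum.
    restored.write_bytes(original.read_bytes())
    intact = backup_is_intact(original, restored)

    # A corrupted or altered restore is detected.
    restored.write_bytes(b"participant_id,age\nP001,43\n")
    corruption_detected = not backup_is_intact(original, restored)
```

Recording the checksum alongside each backup also makes it possible to verify the backup later without access to the original file.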

5. Data sharing and publication

  • Anonymise or pseudonymise datasets to protect participant identities.
  • Encrypt sensitive datasets before transfer. Seek Information Governance (IG) advice for secure transfer methods.
  • Provide appropriate metadata and documentation to ensure datasets are understandable by others.
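
As an illustration of pseudonymisation, the sketch below replaces each participant identifier with a keyed hash (HMAC-SHA256) and drops direct identifiers. It is an assumption-laden example rather than an approved method: the field names, the functions pseudonymise and prepare_for_sharing, and the key are all hypothetical. In practice the key must be stored securely and separately from the shared dataset, and IG advice should be sought on whether the result is sufficient for your study. The output is pseudonymised, not anonymised, because anyone holding the key can recompute the pseudonyms from known identifiers.

```python
import hashlib
import hmac


def pseudonymise(identifier, secret_key):
    """Return a stable pseudonym for an identifier using a keyed hash."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    # Truncated to 16 hex characters for readability (an illustrative choice).
    return digest.hexdigest()[:16]


def prepare_for_sharing(rows, secret_key, id_field="participant_id",
                        drop_fields=("name", "date_of_birth")):
    """Drop direct identifiers and replace the ID with its pseudonym."""
    shared = []
    for row in rows:
        cleaned = {k: v for k, v in row.items() if k not in drop_fields}
        cleaned[id_field] = pseudonymise(row[id_field], secret_key)
        shared.append(cleaned)
    return shared


# Hypothetical key and records, for illustration only.
EXAMPLE_KEY = b"replace-with-a-securely-stored-key"
records = [
    {"participant_id": "P001", "name": "A. Example", "date_of_birth": "1990-01-01", "score": "7"},
    {"participant_id": "P002", "name": "B. Example", "date_of_birth": "1985-06-15", "score": "9"},
    {"participant_id": "P001", "name": "A. Example", "date_of_birth": "1990-01-01", "score": "8"},
]
shared_records = prepare_for_sharing(records, EXAMPLE_KEY)
```

Because the same identifier always maps to the same pseudonym, repeated visits by one participant remain linkable in the shared dataset.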

6. Long-term preservation

  • Archiving – Deposit final datasets in secure, stable environments. Use the Research and Innovation eArchiving procedure where appropriate.
  • Retention periods – Adhere to funder, institutional, and legal requirements for how long data must be retained.
  • Format migration – Review storage formats periodically and migrate files to open, non-proprietary formats to prevent obsolescence.

 

Excel tips for research data management

Excel is a powerful tool for data analysis, but it is not always the best choice for storing large or collaborative datasets. If your dataset becomes very large or requires multi-user input, consider using a database system (eg, REDCap, Access) and reserve Excel for analysis, cleaning, or visualisation. When using Excel, follow these best practices:

1. Data structure and consistency

  • One variable per column, one record per row. Avoid merged cells or mixed data types.
  • Use clear, consistent, unique headers (eg, Week1_DateOfVisit).
  • Keep raw data separate from analysis and charts.
  • Create a data dictionary describing each variable and coding scheme (eg, 1 = Yes, 0 = No).
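
A data dictionary is most useful when the dataset is routinely checked against it. The sketch below assumes the sheet has been exported to CSV and that the dictionary is kept as a simple Python mapping. The column names, the coding scheme, and the function check_against_dictionary are hypothetical examples.

```python
import csv
import io

# Hypothetical data dictionary: one entry per column, with allowed codes where relevant.
DATA_DICTIONARY = {
    "participant_id": {"description": "Study identifier", "allowed": None},
    "consented": {"description": "Consent given", "allowed": {"0": "No", "1": "Yes"}},
}


def check_against_dictionary(rows, dictionary):
    """Return (line number, column, problem) for anything the dictionary does not cover."""
    problems = []
    # Line 1 of the file is the header, so data rows start at line 2.
    for line_number, row in enumerate(rows, start=2):
        for column, value in row.items():
            spec = dictionary.get(column)
            if spec is None:
                problems.append((line_number, column, "column not in data dictionary"))
            elif spec["allowed"] is not None and value not in spec["allowed"]:
                problems.append((line_number, column, f"unexpected code {value!r}"))
    return problems


# Example CSV text standing in for an exported worksheet.
sample_csv = "participant_id,consented\nP001,1\nP002,2\n"
rows = list(csv.DictReader(io.StringIO(sample_csv)))
problems = check_against_dictionary(rows, DATA_DICTIONARY)
```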

2. Data validation and error prevention

  • Apply data validation rules (dropdowns, ranges, date-only fields).
  • Use conditional formatting to highlight duplicates, missing values, or outliers.
  • Freeze headers to keep them visible while scrolling.
  • Avoid blank rows or columns, which can interfere with analysis.
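
The same kinds of checks (missing values, duplicates, ranges, date-only fields) can also be run outside Excel on an exported copy of the data. The following is a minimal sketch under stated assumptions: the field names, the age range of 18 to 99, the ISO date format (YYYY-MM-DD), and the function validate_rows are illustrative choices, not requirements from these guidelines.

```python
from datetime import date


def validate_rows(rows, id_field="participant_id", age_range=(18, 99)):
    """Return (row number, field, problem) for each issue found."""
    issues = []
    seen_ids = set()
    for number, row in enumerate(rows, start=1):
        # Missing values in any field.
        for field, value in row.items():
            if str(value).strip() == "":
                issues.append((number, field, "missing value"))

        # Duplicate identifiers.
        record_id = row[id_field]
        if record_id in seen_ids:
            issues.append((number, id_field, "duplicate identifier"))
        seen_ids.add(record_id)

        # Range check on age (skipped when the value is missing).
        age = row["age"].strip()
        if age:
            try:
                in_range = age_range[0] <= int(age) <= age_range[1]
            except ValueError:
                in_range = False
            if not in_range:
                issues.append((number, "age", "outside expected range"))

        # Date-only check on visit date (skipped when the value is missing).
        visit = row["visit_date"].strip()
        if visit:
            try:
                date.fromisoformat(visit)
            except ValueError:
                issues.append((number, "visit_date", "not an ISO date"))
    return issues


# Hypothetical example rows.
sample_rows = [
    {"participant_id": "P001", "age": "34", "visit_date": "2024-03-01"},
    {"participant_id": "P002", "age": "", "visit_date": "2024-03-02"},
    {"participant_id": "P001", "age": "140", "visit_date": "01/03/2024"},
]
issues = validate_rows(sample_rows)
```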

3. Protecting and backing up data

  • Password-protect sensitive files and store them in restricted network areas.
  • Use read-only mode to prevent accidental edits.
  • Save regularly and keep backups in secure, separate locations.
  • Maintain snapshot copies of the dataset at regular intervals during the study.
  • Do not store sensitive research data on personal devices or in personal cloud storage.

4. Analysis and reproducibility

  • Document formulas and analysis steps (eg, in an ‘analysis notes’ sheet).
  • Avoid overwriting raw data; clean or recode values in new columns instead.
  • Use named ranges and tables for clarity and reduced error risk.
  • Use pivot tables for summaries instead of manual calculations.
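
The principle of recoding into a new column rather than overwriting raw values applies equally when cleaning is scripted. The sketch below is illustrative only: the column names, the mapping CONSENT_CODES, and the function add_recoded_column are hypothetical. Unmapped values are left as None so that they can be reviewed rather than silently guessed.

```python
# Hypothetical coding scheme, matching a data dictionary entry of 1 = Yes, 0 = No.
CONSENT_CODES = {"yes": 1, "y": 1, "no": 0, "n": 0}


def add_recoded_column(rows, source, target, mapping):
    """Return copies of the rows with a new recoded column; the input is not modified."""
    recoded = []
    for row in rows:
        new_row = dict(row)  # copy, so the raw record stays untouched
        key = row[source].strip().lower()
        new_row[target] = mapping.get(key)  # None flags a value that needs review
        recoded.append(new_row)
    return recoded


# Hypothetical raw records with inconsistent free-text entries.
raw = [
    {"participant_id": "P001", "consent_raw": " Yes "},
    {"participant_id": "P002", "consent_raw": "n"},
    {"participant_id": "P003", "consent_raw": "maybe"},
]
cleaned = add_recoded_column(raw, "consent_raw", "consent_code", CONSENT_CODES)
```

Keeping the mapping in one named place also documents the analysis step, which supports reproducibility.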

5. Sharing and archiving

  • Save in non-proprietary formats (eg, CSV) for long-term storage or sharing.
  • Remove personal/sensitive information before sharing externally.
  • Record the Excel version used, as features may vary across versions.
  • Lock formulas and worksheet structure before distributing to collaborators.
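
Where a shareable CSV copy is produced by script, the export step can also leave out columns that must not be shared. This is a rough sketch using Python's standard csv module. The column names and the function export_for_sharing are hypothetical, and an in-memory buffer stands in for the output file.

```python
import csv
import io


def export_for_sharing(rows, handle, exclude=()):
    """Write rows as CSV to an open text handle, leaving out the excluded columns."""
    fieldnames = [name for name in rows[0] if name not in exclude]
    writer = csv.DictWriter(
        handle,
        fieldnames=fieldnames,
        extrasaction="ignore",  # silently skip the excluded columns
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return fieldnames


# Hypothetical records; "name" is treated as personal information here.
study_rows = [
    {"study_id": "S01", "name": "A. Example", "score": "7"},
    {"study_id": "S02", "name": "B. Example", "score": "9"},
]
buffer = io.StringIO()
exported_columns = export_for_sharing(study_rows, buffer, exclude=("name",))
csv_text = buffer.getvalue()
```

Dropping named columns does not by itself make a dataset safe to share, as combinations of remaining variables may still identify participants.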