Goodtables

Goodtables is a free online service for validating tabular data, developed by the Frictionless Data team of the Open Knowledge Foundation. This open source tool checks for basic structural errors such as blank or duplicate rows, duplicate headers, and rows with differing numbers of columns. Data can be validated by providing a URL to the file (e.g. a link to a GitHub repository) or by uploading a file. Several formats are accepted, including CSV, Excel, LibreOffice and Data Package. In addition, a data schema can be uploaded to enable further checks, such as whether the data type (e.g. date), format (e.g. YYYY-MM-DD) and any data constraints (e.g. no later than 2000-01-01) are respected. Documentation about the tool is available at: http://docs.goodtables.io/index.html
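To make the kinds of checks described above more concrete, the following is a minimal sketch written with the Python standard library only. It is not Goodtables' own code or API: the function names check_structure and check_date_column, the messages and the SAMPLE data are all hypothetical and chosen purely for illustration. The first function looks for duplicate headers, blank rows, duplicate rows and rows whose column count differs from the header; the second mimics a simple schema rule (a date column in YYYY-MM-DD format with a latest allowed value).

```python
import csv
import io
from datetime import date, datetime


def check_structure(csv_text):
    """Return a list of (row_number, message) structural problems."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return [(0, "empty file")]
    problems = []
    header = rows[0]
    seen_headers = set()
    for name in header:
        if name in seen_headers:
            problems.append((1, f"duplicate header: {name}"))
        seen_headers.add(name)
    seen_rows = set()
    for number, row in enumerate(rows[1:], start=2):
        # A row with no cells, or only whitespace cells, counts as blank.
        if not any(cell.strip() for cell in row):
            problems.append((number, "blank row"))
            continue
        if len(row) != len(header):
            problems.append((number, "wrong number of columns"))
        key = tuple(row)
        if key in seen_rows:
            problems.append((number, "duplicate row"))
        seen_rows.add(key)
    return problems


def check_date_column(csv_text, column, latest):
    """Check that `column` holds YYYY-MM-DD dates no later than `latest`."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    index = rows[0].index(column)
    problems = []
    for number, row in enumerate(rows[1:], start=2):
        if index >= len(row):
            continue  # blank or short rows are left to check_structure
        value = row[index]
        try:
            parsed = datetime.strptime(value, "%Y-%m-%d").date()
        except ValueError:
            problems.append((number, f"not a YYYY-MM-DD date: {value!r}"))
            continue
        if parsed > latest:
            problems.append((number, f"later than {latest.isoformat()}"))
    return problems


# Hypothetical sample data containing one instance of each problem.
SAMPLE = (
    "id,date,id\n"
    "1,1999-05-01,x\n"
    "\n"
    "1,1999-05-01,x\n"
    "2,2001-02-03\n"
    "3,05/01/1999,y\n"
)

if __name__ == "__main__":
    for problem in check_structure(SAMPLE):
        print(problem)
    for problem in check_date_column(SAMPLE, "date", date(2000, 1, 1)):
        print(problem)
```

Row numbers count the header as row 1. A real validator would cover many more cases (encodings, multiple sheets, richer schema types), so this sketch only illustrates the idea behind the checks.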

DataWiz Knowledge Base

The DataWiz knowledge base is a comprehensive research data management (RDM) guideline for psychology research, intended to support or complement the use of the DataWiz data management tool. The content is structured in three sections: before, during and after data collection and analysis. The first section covers data management planning as well as the various legal and ethical aspects of data management. The second section focuses on best practices and tips for handling and documenting data during research. The last section explains how to share and preserve data at the end of the project.

The R workshops and the R café

Utrecht University organises regular workshops to teach R basics: data handling and visualisation, and making research reproducible with R and R Markdown. The R Café has a more informal set-up, where researchers with R programming skills can meet and learn from each other, or from prepared exercises.

Data Cleaning with OpenRefine for Ecologists

Data Carpentry has developed this course on data pre-processing with OpenRefine, an open source tool for working with data. The course covers topics such as error correction, data formatting and harmonization.

Version control tools & techniques handout

The Massachusetts Institute of Technology (MIT) has developed a series of file organization handouts. The handout on version control briefly summarises different version control techniques and provides an overview of the main differences between automatic change log platforms and tools.

Version control with Git course

This course, prepared by Software Carpentry, explains how Git (and GitHub) can be used to manage versions during a project. It starts with the basics (setting up Git and creating a repository) and continues with practical guidelines for tracking changes, collaborating and resolving conflicts. It also has dedicated sections on the impact of version control on Open Science, licensing and citation.

Guidelines and examples of transcription of qualitative data

The UK Data Service has compiled a set of instructions and best practices to transcribe qualitative data from interviews. This guide seeks to provide advice to ensure methodological consistency and to increase the shareability and reuse of qualitative research data. It provides links to further instructions, examples and a template transcriber confidentiality agreement.

Data processing recommendations for Social Sciences

The CESSDA (Consortium of European Social Science Data Archives) Data Management Expert Guide includes a dedicated chapter on processing data, with tips and examples on topics such as quantitative and qualitative coding, appropriate weighting of survey data and data quality assurance.