FAQ - DPUK Data Portal

Applying for Data

What is the procedure for accessing cohort data?

Applying for data access through DPUK is straightforward. The online form captures the essential information on a proposed study and sends it to cohorts for their approval. Once a study application is submitted, DPUK sends the form to the cohort contacts, who then make a decision in line with the general timescales. If a study is approved, DPUK will arrange with the applicants for signature of the DPUK Data Access Agreement and, upon receipt of a signed copy, make the cohort data available within the analysis environment.

Will researchers need to complete multiple applications for each cohort they wish to access?

DPUK aims to make all aspects of dementias data research easier for all parties concerned. The DPUK application process allows a researcher to make one application for data from multiple cohorts, with each cohort giving its decision within a specified time period. This way, even though individual cohorts consider applications, approvals are managed centrally via DPUK, with the possibility of a dialogue with a cohort during the scrutiny period.

What does the 90-day timescale for data access actually entail?

The 90-day period is counted from the date an application is submitted. DPUK will aim to process applications as efficiently as possible.

The timescale is broken down as follows:

Up to 28 days for a cohort (via their designated contact or PI) to make a decision on an application. We anticipate that any amendments and queries can also be addressed during this period, but of course, this could take place at any stage of engagement.

Following approval, researchers must arrange signature of the DPUK Data Access Agreement. In parallel, DPUK will prepare the data so that they are available upon receipt of the signed Agreement. We estimate the signing process to take between 15 and 45 days, depending on the institution.

Further time may be needed if a researcher has requested bespoke software for their study, or a cohort is preparing subsets of data themselves for upload to the Portal. However, we still aim to make the data available within 90 days of application.
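The breakdown above can be worked through as simple date arithmetic. The following is a minimal sketch only: the function and field names are illustrative and not part of any DPUK system, and it assumes the worst case in which the cohort uses the full 28 days and the signing period starts only after the decision.

```python
from datetime import date, timedelta

# Figures taken from the FAQ text; all names here are hypothetical.
COHORT_DECISION_DAYS = 28   # maximum time for a cohort decision
SIGNING_DAYS_MIN = 15       # estimated shortest signing process
SIGNING_DAYS_MAX = 45       # estimated longest signing process
TARGET_DAYS = 90            # overall target from submission to access


def access_timeline(submitted: date) -> dict:
    """Return indicative milestone dates for an application submitted on `submitted`.

    Assumes the cohort takes the full 28 days and that signing
    follows the decision sequentially (an assumption, not a DPUK rule).
    """
    decision_by = submitted + timedelta(days=COHORT_DECISION_DAYS)
    return {
        "cohort_decision_by": decision_by,
        "agreement_signed_earliest": decision_by + timedelta(days=SIGNING_DAYS_MIN),
        "agreement_signed_latest": decision_by + timedelta(days=SIGNING_DAYS_MAX),
        "target_access_by": submitted + timedelta(days=TARGET_DAYS),
    }


timeline = access_timeline(date(2024, 1, 1))
for milestone, day in timeline.items():
    print(f"{milestone}: {day.isoformat()}")

# Days left over in the worst case for bespoke software or data preparation.
slack_days = TARGET_DAYS - (COHORT_DECISION_DAYS + SIGNING_DAYS_MAX)
print(f"slack in worst case: {slack_days} days")
```

Under these assumptions the decision and signing stages take at most 28 + 45 = 73 days, which leaves 17 days within the 90-day target for any bespoke software or data preparation.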

If cohorts refuse to provide their data in the context of an application, what happens?

If you apply for access to the data from four cohorts and one refuses, for example, the Data Access Agreement would be drawn up to incorporate only the cohorts that have agreed to allow data to be accessed. Cohorts normally provide a reason for denying access, which can be relayed back to the applicant. If a significant number of cohorts refuse access to data for a study, the researcher may need to evaluate whether the proposed project remains viable or whether it would be better to reapply to different cohorts.

Can you modify a submitted data access proposal?

Yes. Once an application has been submitted, DPUK will facilitate minor changes to the content of the submission, or the addition of cohorts, usually by email. If major changes are needed, however (a complete change of cohort choice, a new analysis plan, etc.), we may advise submitting a new application in the interest of clarity.

Is there a way to look at the completeness of data before you apply for access to a cohort? In other words, can you see whether the specific data you need are available?

It is not possible to view the data held in the Portal before making an application. DPUK is refining tools that will provide more details about the cohorts, which will be in the form of data availability tables, data dictionaries, and metadata discovery tools.

Is there any way of altering a project end date if you find out that the one chosen in the beginning was not appropriate?

There are no strict rules on this, but it would be possible to provide a justification/rationale for an alteration to the end date. In much the same manner as DPUK facilitates initial approval, we would send details of any proposed study extensions to the named cohorts for their comment and approval.

Who should I contact to discuss my application?

In the first instance please contact the Swansea team (helpdesk@chi.swan.ac.uk, Tel: 01792 604366).

Why do I need approval from the cohorts?

The Data Portal was established with an ethos of sharing and collaboration, but cohorts were also assured that they would retain control of who accessed their data, and for what purpose. The DPUK application process was developed to ensure this, and it has been streamlined as far as possible to facilitate rapid access for researchers.

Does DPUK plan to incorporate additional cohorts into the portal?

DPUK remains in the process of concluding discussions with a number of cohorts who are completing the legal formalities. New cohorts who wish to explore joining DPUK should contact Emma Squires, Data Project Manager (emma@chi.swan.ac.uk), in the first instance.

Data Types and Categorisation

What kind of data can be accessed? Subject-level or aggregate data?

Please do consult our cohort matrix and cohort directory for specific details (https://portal.dementiasplatform.uk/CohortDirectory). The Data Portal will primarily store subject-level phenotypic data as standard. However, there will be some datasets available in aggregate form together with summary data derived from genetic and imaging studies.

Will DPUK have standardised psychological and behavioural test data on the Data Portal?

Although DPUK takes data ‘as is’ directly from cohorts, many of them use standard behavioural and cognitive tests such as MMSE and MoCA. DPUK does not insist on cohorts standardising their own data collection, and DPUK’s metadata tools will allow researchers to identify cohorts that have used the same tests. Cohorts will of course employ these scales differently; however, DPUK will seek to release variables by category so that areas of study can be captured by topic, with cognitive, lifestyle, and metabolomic data, for example, each available for request as a grouped set of variables.

Can I request specific data categories from a cohort?

This is highly desirable as cohorts are often not keen to provide complete datasets where a strong scientific justification has not been provided to do so. The application form is configured to show which categories of data are available when a specific cohort is selected.

Data Storage, Access and Analysis

How is data stored, accessed and analysed?

Data Storage

Cohort data transferred to DPUK are physically stored on servers at Swansea University accredited to ISO 27001.

Data Access

The data are then provided within a virtual desktop infrastructure to researchers who have had their applications approved. The virtual desktop is a Windows desktop that is accessed via the VMware Horizon View Client software. The Client allows access to the desktop, hosted in Swansea, by two-factor authentication: the combination of a username and password with either a YubiKey device or the more recently adopted Google Authenticator code.

Data Analysis

Once connected, researchers will find the data files they have been given access to in shared folders on the desktop, where they can open them in the provided software. DPUK aims to accommodate researchers whatever software they specialise in, and provides a variety of software in the Windows desktop, such as Design Studio, WinSQL, R, Eclipse, SPSS, SAS, and STATA.

What are the rules on using data within the virtual desktop?

The datasets available from cohorts and any other providers cannot be removed from the virtual desktops; however, any derived results and outputs (e.g. for posters or preliminary data in support of grant applications) can be removed via the ‘data out’ process. This is a simple process of submitting files to DPUK to approve their release, which can be done from within the virtual desktop.

Researchers who wish to use derived results and outputs should follow the DPUK Publications process (see Publication Policy: https://www.dementiasplatform.uk/about/policies), which involves the use of the DPUK logo on posters, for example.

DPUK expects researchers to follow the analysis protocol outlined in their application, and only data that have been applied for will be supplied to researchers. Researchers who already own, or have been approved for access to, data outside of the Portal can bring these into the Portal using the ‘data in’ mechanism. The use of such data must be stated in the submitted application.

How do I get access to my own data to use on the Portal?

DPUK has a ‘data in’ process, which is a standard feature available to all researchers. The mechanism allows for automatic approval of upload to the virtual desktop environment. Files can then be retrieved following the ‘File In and Out’ link within the virtual desktop, and subsequently analysed within the Portal.

Is it possible to add bespoke analysis tools in the Portal?

Yes, this can be considered. Subject to request, there are mechanisms for either DPUK installing the required software or a researcher bringing it in themselves. Please contact Emma Squires, Data Project Manager, at emma@chi.swan.ac.uk, for further details.

Is it possible for a team of researchers who may be geographically dispersed to work simultaneously on a dataset and store and access their analyses?

Yes, provided all the individuals have been named on the study and/or have signed a data access agreement granting access to data. The Data Portal desktops are user-specific in terms of access permissions; however, all researchers on a study have access to the same data and are provided with shared network folders to save collaborative work into, as well as their own personal directories.

How much space is available on the virtual desktop?

This depends on the technical specification of the desktop you selected at the time of application. A Standard desktop will be provided (8 GB RAM, 4 CPUs) unless you have asked for a Large (32 GB RAM, 8 CPUs) or XLarge (128 GB RAM, 16 CPUs) version. A charge may be made for anything other than a Standard desktop and you will be advised of likely costs at the time of application. Users benefit from a scalable storage system which responds to the size of data and results produced from analysis.

Who should I contact to discuss data storage or access?

Please contact Emma Squires, Data Project Manager, at emma@chi.swan.ac.uk.

Cohort Data and Portal Access Charges

Are there any costs involved for accessing cohort data?

The DPUK Data Portal is predominantly a free-to-use resource for researchers across the world. Any bespoke arrangements (see below) may incur a charge. If a researcher is happy to make use of the standard provided infrastructure, including software and storage, there is no cost for analysis. We will only charge for studies where a researcher wishes to use bespoke software or scaled-up compute power and memory.

Further charges may be incurred if a researcher wishes to use data from a cohort that makes a specific charge. Details of such cohorts are given on this page.

Changes to Approved Studies

Can I modify a study once it has begun?

Yes. Once a study is underway, there are a few options for modification. If the modification is a time extension request, we will ask the relevant cohorts to approve the new end date in the same manner as the original approval. If more cohorts are to be added to the study, we can use the original proposal and send it to the new cohorts for approval; the data will then be made available subject to this approval. Should a study need to completely change its topic, methodology, or cohort selection, we would advise creating a new submission to DPUK so that the new study proposal is considered appropriately.

Study Results

Do I automatically have permission to publish the results of my research using data from the portal?

Researchers should follow the process described in the DPUK Data Access Agreement. Briefly, this involves written notice to Swansea 30 days prior to the submission of any publication. Swansea will liaise with the data providers, who have 20 days to object to, or request a delay to, the publication for accuracy/patent reasons. If objections are made, the Swansea team will work with the objector and publishing team to facilitate discussions, with the aim of resolving any outstanding issues. Ultimately, researchers also need to adhere to the DPUK Publications Policy at https://www.dementiasplatform.uk/dpuk-policies.
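To plan around these notice periods, the dates can be worked backwards from a planned submission date. This is an illustrative sketch only: the names are hypothetical, and it assumes the 20-day objection window runs from the date notice is given, which the text above does not state explicitly. The Data Access Agreement remains the authoritative source.

```python
from datetime import date, timedelta

# Figures taken from the FAQ text; all names here are hypothetical.
NOTICE_DAYS_BEFORE_SUBMISSION = 30  # written notice to Swansea
OBJECTION_WINDOW_DAYS = 20          # time data providers have to object


def publication_dates(planned_submission: date) -> dict:
    """Return the latest notice date and the end of the objection window.

    Assumes the objection window starts on the day notice is given.
    """
    notice_due = planned_submission - timedelta(days=NOTICE_DAYS_BEFORE_SUBMISSION)
    objection_window_ends = notice_due + timedelta(days=OBJECTION_WINDOW_DAYS)
    return {
        "notice_due": notice_due,
        "objection_window_ends": objection_window_ends,
        "planned_submission": planned_submission,
    }


dates = publication_dates(date(2024, 6, 30))
for label, day in dates.items():
    print(f"{label}: {day.isoformat()}")
```

For a planned submission on 30 June 2024, notice would be due by 31 May 2024 and, under the stated assumption, the objection window would close on 20 June 2024.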