The “open” aspect of data is very important, not only for the transparency of research but also for the verifiability and reusability of data in other contexts. Openness brings a number of challenges, for example anonymizing data without losing quality, which is particularly important for medical data. The foundation therefore aims to support openness of data in several ways.
Storage of datasets
- Ensuring data is legally secure, open and potentially usable.
- Promoting data usage by making datasets accessible and findable.
- Cataloguing information about selected results obtained from data to inspire further research and implementation.
- Creating benchmarks for stored datasets. A typical benchmark consists of:
  - a full description of a specific problem/task,
  - a description of the quality measure (result) specific to the problem/task,
  - links from the problem to specific datasets,
  - the best results obtained on those datasets.
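The four benchmark elements listed above could be captured in a small record type. The following is a minimal sketch only, not the foundation's actual schema: the class names `Benchmark` and `Result`, all field names, and the `higher_is_better` flag are hypothetical, and the values in the usage lines are made-up placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Result:
    """One reported score on a benchmark (hypothetical structure)."""
    method: str
    score: float


@dataclass
class Benchmark:
    """A benchmark entry with the four elements of a typical benchmark."""
    task_description: str    # full description of the problem/task
    metric_description: str  # the quality measure used to score results
    dataset_ids: List[str]   # datasets the problem is linked to
    # Assumption: the direction of the metric is stored explicitly.
    higher_is_better: bool = True
    results: List[Result] = field(default_factory=list)

    def best_result(self) -> Optional[Result]:
        """Return the best recorded result, or None if none is recorded."""
        if not self.results:
            return None
        pick = max if self.higher_is_better else min
        return pick(self.results, key=lambda r: r.score)


# Usage with placeholder values (not real data or real results).
bench = Benchmark(
    task_description="Placeholder task: classify records into two classes",
    metric_description="Accuracy on a held-out test split",
    dataset_ids=["example-dataset-1"],
)
bench.results.append(Result(method="baseline-a", score=0.71))
bench.results.append(Result(method="baseline-b", score=0.78))
print(bench.best_result().method)  # prints "baseline-b"
```

Storing the metric direction next to the results is one possible design choice: it lets the "best result" be computed the same way for accuracy-like measures and for error-like measures, where lower is better.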
How can the foundation help?
- Consultations on assessing how usable the data is
- Anonymization methods
- Data preprocessing and feature creation
- Creating data descriptions for machine learning tasks
- Creating sample baseline models
Consult us if:
- You have data and want to make it public.
- You want to collect data and you have a potential source, but you don’t know how to start.
- You want to find out whether a particular problem is suitable for machine learning.