1. What is "open data"
Open data (and content) can be freely used, modified, and shared by anyone for any purpose (http://opendefinition.org/).
"Cohesion Open Data" is maintained by the European Commission to share unique, structured data generated from the monitoring of EU Budget funded, shared management programmes. The unique data relates primarily to different forms of financial inputs (planned and implemented), EU payment data and to outputs and result indicators of the physical results of the funded programmes.
Find out more about #CohesionOpenData: Website - Blog - FAQ - User Guide to ODP - User guide to visualisation - Browse data stories
2. How to access the platform
Direct link: https://cohesiondata.ec.europa.eu/
Also accessible from Inforegio under "Information sources" – "Cohesion Open Data"
3. How to access the data
The data behind the ESIF visualizations can be accessed and downloaded from the platform in two ways:
2. If you know which dataset you want to manipulate or are searching for specific data or charts, search in the “Catalogue” link found on the home page => https://cohesiondata.ec.europa.eu/browse
The catalogue gives access to a library of open data objects with a search tool with various filters.
The following search functions can be used cumulatively:
- Free search (1);
- "Categories" tags (2) - programme periods, dataset types;
- "View types" tags (3) - the main types used are datasets, charts, filtered views and stories;
- Word "Tags" (4);
- "Sort by" tags (5).
4. What is a Metadata page?
4. What is a Metadata page?
Each dataset has a metadata page with its own URL. It introduces users to the dataset and gives important information about the
dataset as follows:
- Data description;
- Date of last update;
- Links to featured content, i.e. graphs created by the data owner;
- Key details about the dataset: Usage, owner, contact, licence, etc.;
- Structure of the data;
- Related content (i.e. graphs created by other users).
Before you start reusing the data it is essential to understand what the data relates to and how it is
structured in order to make useful interpretations and charts. The metadata is there for that purpose. For
instance the metadata highlights important characteristics as follows:
- In all cases for 2014-2020 the values are expressed cumulatively not annually. Some other datasets have annualised values;
- The data may be presented in annual snapshots. In those cases you can compare annual timeseries, but never aggregated the values from differnent years;
- Financial values are always expressed in current prices.
5. Create a profile to fully exploit the platform functionalities
Create a profile with you own password by enrolling on this site: https://cohesiondata.ec.europa.eu/login
With a profile you can filter, create and save objects on the platform for reuse by you and other users.
6. Interacting with and reusing the data
Once you “view data” from the metadata page the platform offers several ways to interact with the
data.
You can for instance filter a dataset to keep just one country, one Thematic Objective, one fund; you can
create charts or other types of visualizations to illustrate a report or to embed on a webpage.
All the functionalities are accessible in the toolbars indicated below. These are explained below.
6.2 Filtering / filtered views
This tool works like the pivot table in Excel. You must first define the filter conditions.
- Users can add filters based on column titles in the dataset => "fund", "dimension_type";
- Choose the condition to be applied => "is one of", "is not";
- Filter the elements to which the contion is applied => "ERDF", "Intervention Field";
- Remember to "Apply" the filters to the dataset;
- Export the filtered dataset.
6.3 Exporting and reusing data
Once you “view data” from the metadata page the platform offers several ways to export and use the data.
6.3.1 Static export
A static export is a one off download of the data for local use. Once you download the data you have captured a shapshot and the local copy will not update.
From the metadata page the "Export" button opens a pop up box with download options. These include choices of the export format.
When exporting data file there are two typical challenges that users face if using MS Excel:
- The number and financial amounts with commas and decimal formatting are not accurately imported into Excel giving false results when checked against reference values on the platform: => See Annex 1 to correct set your default excel settings for numbers and financial amounts;
- The .csv file is not formatted in columns: => see Annex 2 to see how to reformat the .csv format in Excel columns and cells.
6.3.2 Dynamic export with an OData link
Consider using the OData. It is a permanent link between the platform and the excel file. It can be used to avoid regular manual downloads. Refreshing the data will automatically update on your excel.
6.3.3 Reuse data using a SODA API interconnector
Each published dataset on the platform is provided with an application programming interface (API) to allow direct access to the data from a variety of programming softwares.
IT developers can use the Socrata developer portal and the API documents to programmatically reuse the data directly to build filtered views and visualisations.
6.3.4 Connecting to cohesion open data using Stata
Requirements:
- Stata software
- Active internet connection
- Enable Stata to access internet
Process:
First, check the internet connection of your Stata installation. You can do this by reviewing the system
settings or running commands such as update or view "https://commission.europa.eu/index_en".
Note, working from within institutions such as the European Commission you may be required to set a proxy, port etc.
Second, be mindful of your operating system’s and Stata-specific special characters. For example, using Stata on Windows requires to escape $ with a \ when using it in a context such as an URL.
The final code to use to access the dataset e4v6-qrrq (https://cohesiondata.ec.europa.eu/2014-2020/ESIF-2014-2020-FINANCES-PLANNED-DETAILS/e4v6-qrrq) when operating in a context requiring setting a proxy, port, and user identification to access websites is:
loc un <username in browser>
loc pw <password in browser>
loc proxy <proxy_address_by_your_institution>
loc port <port_by_your_institution >
loc file <file_name_you_want_to_assign>
loc limit <limit_of_data_such_as_number_of_rows>
shell curl --proxy `un':`pw'@`proxy':`port' https://cohesiondata.ec.europa.eu/resource/e4v6-qrrq.csv?\$limit=`limit' > `file'.csv
All lines are to be run at once. You can also write all information in a single line. The code only specifies the limit parameter though other queries should function in a similar fashion.
The command ultimately opens a small popup window to operate outside of Stata (using a shell in Stata to call curl from the terminal), access the files, and store them at a location of your choice. Mind to go against intuition and omit double or single quotations.
Potential pitfalls when pursuing other approaches:
Running import delimited "https://cohesiondata.ec.europa.eu/resource/e4v6-qrrq.csv" users can access the data but may be limited to the first 1 000 rows. If using any of the provided API endpoints (so following the URL with “?$query=SELECT”…) then users can be faced with various errors (r(603) or r(672); cf. https://www.stata.com/support/faqs/web/common-connection-error-messages).
Setting the limit parameter (as suggested here: https://support.socrata.com/hc/en-us/articles/202949268-How-to-query-more-than-1000-rows-of-a-dataset) with the following code https://cohesiondata.ec.europa.eu/resource/e4v6-qrrq.csv?$limit=50000 may not be successful due to the missing escape of the $ symbol. However, this approach is helpful for when accessing the data via PowerBI.
6.4 Visualizing the data
The platform provides the possibility to make a range of interactive charts and visuals, without IT programming knowledge.
This separate data story provides a guide to building #CohesionOpenData visuals to share and reuse. It provides detailed instructions how to create charts as follows:
6.6 Share
You have the possibility to share any dataset, views or chart via social media.
6.7 About = Metadata
This link gives you access to information on the data description, usage, owner, contact and last update.
ANNEX 1: Setting the correct number format to open and read an exported “.CSV (for Excel)” file
ANNEX 2: Reformatting an exported “.CSV (for Excel)” file to display data in cell
RESOURCES AND CONTACTS
DG REGIO: Find out more about #CohesionOpenData: Website - Blog - FAQ - User Guide to ODP - User guide to visualisation - Browse data stories
Tylertech/Socrata:
- Developer Platform: https://dev.socrata.com/
Date of text: April 2022
Authors: Iluta BADERE, John WALSH
E-mail: regio-webmaster@ec.europa.eu