A necessary dialogue about text and data mining (TDM) in Europe

Text by Christina Lenz, Managing Editor, Stockholm University Press

There is, more or less, a consensus about why TDM is important in Europe and that we need to collaborate towards common goals and actions, but many challenges stand in the way of those goals.

The TDM landscape is extremely diverse, and it is changing constantly and rapidly. The stakeholders are many, and they sometimes have different agendas. At present there is no European coordination, and there are no clear guidelines in the EU. Both are necessary, and we need to collaborate more.

Sketch by Elco van Staveren 11/11/15 CC-BY

Induce changes in the copyright law

One of the biggest challenges is to bring about the changes in copyright law that The Hague Declaration calls for. A copyright exception for TDM will enable libraries and their users to contribute to an innovative and competitive Europe. (Read SUP's blog post Imagine free data mining.)

Commercial publishers protect their copyright and mostly don't have open data. Some publishers are developing their own databases and even have search engines for text and data, which are usually not available to the general public or to researchers. Another challenge, therefore, is to find ways for libraries in Europe to negotiate jointly with these publishers to make text and data open for mining.

Technical, organisational and financial challenges

Major challenges are technical, such as developing software tools, algorithms and APIs (application programming interfaces), and resolving issues with OCR (optical character recognition). There are also organisational and financial challenges for all involved when new investments need to be made in knowledge, education and technology.

Solutions and actions

A variety of initiatives are under way at institutions, at national libraries and among other stakeholders, e.g. OpenMinTeD, Europeana and LIBER. These, and surely many other projects going on in Europe, have resulted in or are striving towards interesting solutions and good experiences that we can all learn from. These experiences need to be highlighted as good examples of why TDM is important and necessary, but also of how we can overcome some of the challenges we are facing together.

In November I attended the workshop “Text and Data Mining in Europe: Defining the Challenges and Actions” (http://www.theeuropeanlibrary.org/tel4/newsitem/9500), organised by The European Library in collaboration with OpenMinTeD, Europeana and LIBER. The organisers did succeed in contributing to a necessary dialogue about TDM, which we all need to be part of. So, to cite Max Kaiser: “Let’s talk!” Or rather: “Let’s keep talking.” (See also Max Kaiser’s presentation.)

Next week I will write a blog post about how we at Stockholm University Library work with and promote text and data mining.

What challenges do you see with text and data mining in your country and your organisation?

Read more about text and data mining in the LIBER TDM Factsheet.

