Content Mine: scientific literature exploration through text mining

In this episode, we interviewed Peter Murray-Rust, a chemist at Cambridge University. Peter is also known for his work on and support of open access and open data. Among his projects is the Content Mine software chain, which we discussed in this episode. The Content Mine group maintains this open source software and also offers consulting services to assist individuals or groups interested in the suite.

Content Mine is a suite of open source software designed to mine and analyze the scientific literature. The Content Mine group currently offers three packages: getpapers, ami and norma. Together, these three packages are meant to let us download large sets of papers on a given subject, normalize the resulting data so that it is easier to explore, and then begin analyzing it with basic tools such as word counts and regular expressions. We explored and discussed these packages and how they could serve a researcher. You will also learn about the history of ContentMine, its team, and the opinion of publishers, such as Elsevier, regarding such practices.
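
To give a feel for the kind of basic analysis mentioned above, here is a minimal sketch of word counts and a regular-expression search in plain Python. It is not how ami is implemented and it does not call any Content Mine tool. The sample text, the function names `word_counts` and `find_matches`, and the search pattern are all made up for illustration.

```python
import re
from collections import Counter


def word_counts(text, top=3):
    """Return the `top` most frequent lowercase words in `text`."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(words).most_common(top)


def find_matches(text, pattern):
    """Return every substring of `text` that matches the regular expression."""
    return re.findall(pattern, text)


# Hypothetical snippet standing in for the normalized text of a paper.
sample = (
    "Zika virus was detected in Aedes aegypti. "
    "Aedes albopictus may also carry Zika virus."
)

# Most frequent words in the snippet.
print(word_counts(sample))

# Species names of the form "Aedes <epithet>".
print(find_matches(sample, r"\bAedes [a-z]+\b"))
```

On a real corpus, the same two functions would be applied to the text of each downloaded paper, and the counts would be aggregated across papers.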