Topic modelling helps publisher deliver what readers want


Finding out what readers read, what keeps them coming back, and what inspires them to subscribe has been key to a project by Berlin-headquartered Funke Media Group.

"We wanted to answer these questions to get to know our readers better, to find out what is important to them and what they need," says interactive team head Marie-Louise Timcke.

"We also wanted to find out what we can do to achieve greater reach, more subscriptions, or to strengthen subscriber loyalty to our products."

In an INMA Ideas Blog post, she describes how the team built its own analysis, based on full-text topic modelling of articles, to identify which topics performed well or poorly in the areas of reach, paid content, and retention, and where output should be increased or reviewed. "The methodology is based on various text clustering techniques and finds finer topic groups than internal re-sorts, sections, or tags allow," she says.

Funke Media Group developed its own way to analyse page views in each category.

"We then evaluate the performance of these topic groups in the areas of reach, paid, and retention in comparison to the costs, meaning the amount of published articles on the topic. In this way, we can identify topics for which we generate a lot of content but that are not well received by our readers, or topics for which we have little content but are of great interest to our users."

Stories were analysed to guide future coverage: which topics work well for reach, which ones are users willing to pay for, and which ones do subscribers read regularly?

"One way to analyse this is to look at the page views and subscriptions for each category," says Timcke.

"Each article is assigned to a category by the journalist who wrote it, for example sports or politics. The problem with these sections is that they are very broad. Is a reader who reads an article about Angela Merkel interested in politics in general? Or is she interested in the topic of rent, on which Angela Merkel is quoted in the article?

"Without a more detailed tagging of the texts in terms of sub-topics or even emotions reflected in the article, the challenge is to identify the specific topics of the articles. We used different topic modelling and text clustering algorithms to achieve this goal."

Topic modelling creates a network of marker words that often appear together in articles.

This simple example shows a network of marker words that often appear together in articles and can therefore be combined to form topic clusters. "For example, many articles that contained the word 'goal' also contained the word 'soccer' or 'team'.

"There are occasional cross-references between the clusters because there are marker words that occur in several clusters, such as the word 'fan', which belongs to both the soccer and music clusters."
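The marker-word network Timcke describes can be illustrated with a small co-occurrence count. The following is only a minimal sketch, not Funke's actual pipeline: the article sets, the `cooccurrence` helper, and the assumption that marker words have already been extracted per article are all hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical marker words per article (illustrative data, not from Funke).
articles = [
    {"goal", "soccer", "team"},
    {"goal", "team", "fan"},
    {"soccer", "fan", "team"},
    {"concert", "band", "fan"},
    {"band", "album", "concert"},
]

def cooccurrence(docs):
    """Count how many articles contain each unordered pair of marker words."""
    pairs = Counter()
    for words in docs:
        # Sorting gives every pair a stable (a, b) order with a < b.
        for a, b in combinations(sorted(words), 2):
            pairs[(a, b)] += 1
    return pairs

edges = cooccurrence(articles)

# Words linked to 'fan': it touches both the soccer and the music vocabulary,
# which is the kind of cross-reference between clusters mentioned above.
fan_neighbours = sorted({w for pair in edges if "fan" in pair for w in pair} - {"fan"})
```

Pairs with a high count, such as `("goal", "team")`, form the dense parts of the network, while a word like 'fan' shows up as a bridge between otherwise separate groups.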

She says text clustering algorithms can be used to identify the marker word clusters. The programme evaluates all articles and divides them into groups so that the articles within a group have as much in common as possible (frequently occurring marker words), while the groups differ from each other as much as possible (with as few links to other clusters as possible).
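The article does not name the algorithms Funke used, so the sketch below swaps in one simple stand-in: greedy average-linkage clustering on Jaccard similarity of marker-word sets. The data, the `cluster` function, and the `min_similarity` threshold are assumptions chosen for illustration only.

```python
from itertools import combinations

# Hypothetical marker words per article (illustrative data only).
articles = [
    {"goal", "soccer", "team"},
    {"goal", "team", "fan"},
    {"soccer", "fan", "team"},
    {"concert", "band", "fan"},
    {"band", "album", "concert"},
]

def jaccard(a, b):
    """Share of marker words two articles have in common (0 = none, 1 = identical)."""
    return len(a & b) / len(a | b)

def cluster(docs, min_similarity=0.3):
    """Repeatedly merge the two most similar groups until none is similar enough."""
    groups = [[i] for i in range(len(docs))]

    def avg_sim(g1, g2):
        # Average pairwise similarity between the articles of two groups.
        total = sum(jaccard(docs[i], docs[j]) for i in g1 for j in g2)
        return total / (len(g1) * len(g2))

    while len(groups) > 1:
        (x, y), best = max(
            (((x, y), avg_sim(groups[x], groups[y]))
             for x, y in combinations(range(len(groups)), 2)),
            key=lambda item: item[1],
        )
        if best < min_similarity:
            break  # remaining groups differ too much to be merged
        groups[x] = groups[x] + groups[y]
        del groups[y]  # safe because x < y
    return [sorted(g) for g in groups]

topic_groups = cluster(articles)
```

With this toy data the soccer articles (0, 1, 2) and the music articles (3, 4) end up in separate groups even though 'fan' appears in both, because a single shared word is not enough to clear the threshold. Lowering `min_similarity` far enough would merge everything into one group.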

The search process was refined by applying this topic modelling in several steps, by searching again for clusters within the found topic groups. "Instead of sorting our articles only according to whether they deal with the topic of soccer, we found, for example, groups of articles that explicitly dealt with emotional events on the soccer field instead of just reporting on matches," she says.

Using a grid, Funke Media Group can interpret the success of different articles.

"Being able to assign the articles to the most detailed topic groups possible was the basis for analysing our topic performance.

"We compared the output - the number of published articles per topic cluster - with the number of page views, subscriptions, or page views of subscribers. To interpret the results, we used a grid that divides the topic groups into successful and less successful."

Timcke says Funke Media Group was inspired by a content portfolio framework from Amedia, which allowed the team to identify topics on which few articles were published but which were widely read, thus showing untapped potential.

"An increase in output could also mean an increase in page impressions. Likewise, topics can be identified where the interest of the readers remains very low despite high outputs. A closer look is worthwhile here: Should the output be reduced or the type of reporting changed?"
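The grid logic can be sketched as a simple quadrant assignment. The article does not say how the grid's boundaries are set, so the median split, the quadrant labels, the topic names, and all figures below are assumptions for illustration.

```python
from statistics import median

# Hypothetical per-topic figures: (articles published, page views).
topics = {
    "soccer match reports": (120, 90_000),
    "emotional soccer moments": (15, 150_000),
    "local council": (100, 20_000),
    "rent and housing": (20, 18_000),
}

def portfolio_grid(stats):
    """Place each topic in a quadrant relative to median output and median page views."""
    out_mid = median(output for output, _ in stats.values())
    view_mid = median(views for _, views in stats.values())
    # Keys are (above-median output?, above-median page views?).
    labels = {
        (False, True): "untapped potential",  # little content, strong interest
        (True, True): "working well",         # lots of content, strong interest
        (True, False): "review output",       # lots of content, weak interest
        (False, False): "low priority",       # little content, weak interest
    }
    return {
        name: labels[(output > out_mid, views > view_mid)]
        for name, (output, views) in stats.items()
    }

grid = portfolio_grid(topics)
```

The same function could be run with subscriptions or subscriber page views in place of total page views to produce one grid per target area.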

Using a standardised analysis to examine the digital content portfolio of each newspaper title individually made it possible to draw up concrete instructions and tips for adapting the portfolio, tailored to each news platform and separated by target area (more reach, more subscriptions, better retention). The visualisations and on-site presentations helped communicate these instructions and tips to the newsroom and, in combination with daily newsletter and paid content training, have had a big impact on how newsrooms now evaluate topics before deciding how to cover them.
