Data
This page presents some views of the aggregate data collected about the AI-generated analyses of headlines, broken down by model and by media source. Individual analyses may hit or miss the mark, but I think overall model performance is easier to understand in aggregate.
Each section includes an explanation of the data collected, the relevant task instructions given to the AI, and some general observations about the results. Data on this page will be updated daily as new analyses are generated.
Analysis Quality by Model
HeadlineWise sends each headline to one of several different generative models for analysis. Models are prompted with identical instructions for the analysis. Each analysis has three parts: language categorization, political bias categorization, and a written explanation of how the model arrived at its conclusions, quoting directly from the analyzed headline if possible.
A few notes about my analysis review methods:
- I compare the generated analysis to the headline in the UI, where only the headline itself and the analysis are displayed. The news source is hidden. I accept or reject each analysis as a whole.
- I mark a generated analysis as accepted if it passes my common sense test: language and politics categories seem reasonable (even if other interpretations are possible) and the logic in the analysis text is cogent and a reasonable interpretation of the headline text.
- I mark an analysis as rejected if something seems obviously off: categorization is very obviously wrong, the explanation does not seem connected to the text, or the written explanation includes hallucinated quotes not in the headline.
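The hallucinated-quote check in the last bullet is the one rejection criterion that lends itself to partial automation. Below is a minimal Python sketch, not the process actually used on this site: it assumes the explanation marks quotations with double quotes, and the names `find_unsupported_quotes` and `normalize` are hypothetical helpers invented for illustration.

```python
import re

# Matches text inside straight or curly double quotes.
QUOTE_PATTERN = re.compile(r'["\u201c]([^"\u201c\u201d]+)["\u201d]')


def normalize(text: str) -> str:
    """Lowercase, straighten apostrophes, and collapse whitespace."""
    text = text.replace("\u2019", "'").lower()
    return " ".join(text.split())


def find_unsupported_quotes(headline: str, explanation: str) -> list[str]:
    """Return quoted spans from the explanation that do not appear in the headline."""
    normalized_headline = normalize(headline)
    unsupported = []
    for quoted in QUOTE_PATTERN.findall(explanation):
        # Ignore trailing punctuation the model may have tucked inside the quotes.
        cleaned = normalize(quoted).strip(".,;:!?")
        if cleaned and cleaned not in normalized_headline:
            unsupported.append(quoted)
    return unsupported


if __name__ == "__main__":
    headline = "Senator slams 'reckless' budget plan"
    explanation = 'The word "slams" is inflammatory, and "total disaster" signals bias.'
    print(find_unsupported_quotes(headline, explanation))
```

A check like this can only flag candidates for manual review. A model may also put category names in quotes, so a flagged span is not necessarily a hallucinated quote.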
And a few limitations to be aware of:
- Reviews are conducted by me alone, so there's still some subjectivity to this process, although I try to be consistent.
- I currently accept or reject the analysis as a whole, but in the future I would also like to collect more data on reasons for rejection to understand which types of tasks are more difficult for the models.
- When viewing aggregate data per media source below, keep in mind that HeadlineWise selectively queries the News API for specific topics; the headlines analyzed are not a representative sample of all the content published by that source.
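The accept/reject reviews described in this section can be rolled up into an acceptance rate per model. The sketch below shows one way to do that, assuming each review is stored as a `(model_name, accepted)` pair; that record shape and the function name `acceptance_rate_by_model` are assumptions for illustration, not the site's actual schema.

```python
from collections import defaultdict


def acceptance_rate_by_model(reviews):
    """Return {model_name: fraction of reviewed analyses that were accepted}.

    `reviews` is an iterable of (model_name, accepted) pairs, where
    `accepted` is True for an accepted analysis and False for a rejected one.
    """
    totals = defaultdict(int)
    accepted_counts = defaultdict(int)
    for model_name, accepted in reviews:
        totals[model_name] += 1
        if accepted:
            accepted_counts[model_name] += 1
    return {name: accepted_counts[name] / totals[name] for name in totals}


if __name__ == "__main__":
    # Placeholder model names and made-up reviews, purely for illustration.
    sample_reviews = [
        ("model-a", True),
        ("model-a", False),
        ("model-b", True),
    ]
    print(acceptance_rate_by_model(sample_reviews))
```

Because each analysis is accepted or rejected as a whole, a rate computed this way says nothing about which part of a rejected analysis failed.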
Language Analysis by Media Source
As part of each analysis, the model is instructed to categorize the headline's phrasing using one or more of the following categories: "objective, subjective, inflammatory, emotionally provocative." Some analyses have come back with additional categories generated by the model, including modified variations on the categories in the system prompt.
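Since models sometimes return labels outside the prompted list, aggregating by category needs some way to separate the four prompted categories from model-invented ones. Here is a minimal sketch, assuming the model's labels arrive as a list of strings; `split_categories` is a hypothetical helper, and matching on exact lowercase text is only one possible rule.

```python
# The four categories named in the task instructions.
ALLOWED_CATEGORIES = (
    "objective",
    "subjective",
    "inflammatory",
    "emotionally provocative",
)


def split_categories(raw_labels):
    """Split labels into (prompted categories, extra model-generated labels).

    Labels are lowercased and whitespace-collapsed before comparison, and
    duplicates are dropped while preserving first-seen order.
    """
    known, extra = [], []
    for label in raw_labels:
        cleaned = " ".join(label.lower().split())
        target = known if cleaned in ALLOWED_CATEGORIES else extra
        if cleaned and cleaned not in target:
            target.append(cleaned)
    return known, extra


if __name__ == "__main__":
    labels = ["Objective", "slightly inflammatory", "emotionally  provocative"]
    print(split_categories(labels))
```

Modified variations such as "slightly inflammatory" land in the extra list under this rule. Whether to fold them back into the base category is a judgment call the sketch deliberately leaves open.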
Political Analysis by Media Source
As part of each analysis, the model is instructed to classify the perspective of the headline writer along the following scale: "very right-leaning, slightly right-leaning, neutral, slightly left-leaning, very left-leaning (in the context of American politics)."
In my review, this category is often the area of the analysis I'm most inclined to disagree with, but I've made a rule to accept the categorization if it seems like a reasonable interpretation of the text (even if other interpretations are possible). The issue seems to arise particularly when the headline comments on or frames a quote by a political actor, when it seems (to me) to be phrased sarcastically or insincerely, or when it focuses coverage on something I believe to be a politically charged pet issue that's statistically uncommon.
One complication I've observed is that certain news organizations sometimes include their name in the text of the headline. Although the system prompt instructs the model to ignore names of publications or media outlets when generating the analysis, the models seem to ignore this instruction, at least some of the time. I've specifically observed a few cases where the model explicitly bases its political classification on this reference, for example:
- Several headlines that included "Breitbart" or even the truncated "Breit..." were classified as right-leaning because Breitbart "is known to be a conservative organization"
- A headline that included "CNN" was classified as slightly left-leaning because CNN "is a slightly left-leaning" organization
In the future I may introduce some text filtering to try to mitigate this bias and see if the results change, but for now all headlines are sent for analysis as published.
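As a rough idea of what such text filtering could look like, here is a minimal Python sketch. It is not implemented on this site. The outlet list is a hypothetical example, `strip_outlet_names` is an invented helper, and the approach (removing the outlet name as a whole word along with an adjacent `-`, `|`, or `:` separator) is only one option.

```python
import re

# Hypothetical list for illustration; a real list would cover every source queried.
OUTLET_NAMES = ["Breitbart", "CNN"]


def strip_outlet_names(headline, outlet_names=OUTLET_NAMES):
    """Remove outlet names, plus an adjacent '-', '|', or ':' separator, from a headline."""
    cleaned = headline
    for name in outlet_names:
        pattern = re.compile(
            r"\s*[-|:]?\s*\b" + re.escape(name) + r"\b\s*[-|:]?\s*",
            re.IGNORECASE,
        )
        # Replace with a single space so neighboring words do not run together.
        cleaned = pattern.sub(" ", cleaned)
    # Collapse any leftover whitespace.
    return " ".join(cleaned.split())


if __name__ == "__main__":
    print(strip_outlet_names("Border crisis deepens - Breitbart"))
    print(strip_outlet_names("CNN: Poll shows tight race"))
```

Whole-word matching would not catch truncated names such as "Breit...", and removing a name mid-sentence can change what the headline says, so filtered and unfiltered results would need to be compared before relying on a filter like this.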