For a non-commercial, private school project I'm creating a piece of software that searches for lyrics based on the song that is currently playing on Spotify. I have to do this in C# (it's a requirement), although I can use other languages as well if I want to.
I've found a few sites that I can fetch the lyrics from. I have already succeeded in fetching the entire HTML of a page, but I'm not sure what to do after that. I asked my teacher, and she told me to use XML (which I also found complicated :p). I've read quite a bit about it and searched for examples, but haven't found anything that seems applicable to my case.
This is the relevant part of the HTML:
<p class="mxm-lyrics__content" data-reactid="200">First line of the lyrics!
These words will never be ignored
I don't want a battle
<!-- react-empty: 201 -->
<div class="inline_video_ad_container_container" data-reactid="203">
<div id="inline_video_ad_container" data-reactid="204">
<div class="" style="line-height:0;" data-reactid="205">
<div id="div_gpt_ad_outofpage_musixmatch_desktop_lyrics" data-reactid="206">
//Really nice google ad JS which I have removed;
<p class="mxm-lyrics__content" data-reactid="207">But I got a war
More fancy lyrics
That I want to fetch
string source = "https://www.musixmatch.com/lyrics/Bullet-for-My-Valentine/You-Want-a-Battle-Here’s-a-War";
// The HtmlWeb class is a utility class to get the HTML over HTTP
HtmlWeb htmlWeb = new HtmlWeb();
// Creates an HtmlDocument object from an URL
HtmlAgilityPack.HtmlDocument document = htmlWeb.Load(source);
// Targets a specific node
HtmlNode someNode = document.GetElementbyId("mxm - lyrics__content");
if (someNode != null)
{
    foreach (var node in document.DocumentNode.SelectNodes("//span/div[@id='site']/p[@class='mxm-lyrics__content']"))
    {
        // here is your text: node.InnerText "//div[@class='sideInfoPlayer']/span[@class='wrap']"
    }
}
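One possible solution, using either LINQ or XPath (this needs using System; using System.Linq; and using HtmlAgilityPack;):
var htmlWeb = new HtmlWeb();
var documentNode = htmlWeb.Load(source).DocumentNode;

// Select every <p> whose class attribute contains "mxm-lyrics__content"
var findclasses = documentNode.Descendants("p")
    .Where(d => d.Attributes["class"]?.Value.Contains("mxm-lyrics__content") == true);

// or, with XPath:
// var findclasses = documentNode.SelectNodes("//p[contains(@class,'mxm-lyrics__content')]");

// Join the text of all matching paragraphs, one paragraph per line
var text = string.Join(Environment.NewLine, findclasses.Select(x => x.InnerText));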
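Since the question mentions that the HTML has already been fetched, the same idea can be applied to a string instead of a URL. The following is only a minimal sketch under a few assumptions: the class name `LyricsExtractor` and the method `ExtractLyrics` are made up for illustration, the page is assumed to still mark lyrics with the `mxm-lyrics__content` class, and the HtmlAgilityPack NuGet package is assumed to be installed. It also guards against `SelectNodes` returning `null` when nothing matches, and decodes HTML entities such as `&amp;`.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

public static class LyricsExtractor
{
    // Hypothetical helper (not from the original post): returns the text of all
    // lyrics paragraphs in the given HTML, or an empty string if none are found.
    public static string ExtractLyrics(string html)
    {
        // Parse HTML that is already in memory instead of downloading it again
        var document = new HtmlDocument();
        document.LoadHtml(html);

        // Assumption: lyrics live in <p> elements whose class contains "mxm-lyrics__content"
        var paragraphs = document.DocumentNode
            .SelectNodes("//p[contains(@class,'mxm-lyrics__content')]");

        // SelectNodes returns null (not an empty collection) when nothing matches
        if (paragraphs == null)
        {
            return string.Empty;
        }

        // Decode entities, trim surrounding whitespace, and put each paragraph on its own line
        return string.Join(
            Environment.NewLine,
            paragraphs.Select(p => HtmlEntity.DeEntitize(p.InnerText).Trim()));
    }
}
```

With the page source in a string variable (say `html`), this would be called as `string lyrics = LyricsExtractor.ExtractLyrics(html);`. An empty result would then suggest that the site's markup has changed or that the lyrics are filled in by JavaScript after the page loads.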