For example, RSS has:
- <description>
- <media:description>
- <media:text>
all of which can contain the "right" description for a certain feed item.
A site like CNN.com will have its description data in the usual <description> tag (this is the preferred and most common scenario):
Code:
<item>
<title><![CDATA[Trump predicts 'very good chance' of China trade deal ]]></title>
<description><![CDATA[President Donald Trump expressed optimism on a trade deal with China, after meeting with Chinese Vice Premier Liu He in the Oval Office on Friday afternoon. ]]></description>
<link>https://www.cnn.com/2019/02/22/politics/trump-china-trade-talks/index.html</link>
<guid isPermaLink="true">https://www.cnn.com/2019/02/22/politics/trump-china-trade-talks/index.html</guid>
<pubDate>Sat, 23 Feb 2019 07:05:12 GMT</pubDate>
<media:group>
<media:content medium="image" url="https://cdn.cnn.com/cnnnext/dam/assets/190222152628-02-donald-trump-liu-he-super-169.jpg" height="619" width="1100" />
<media:content medium="image" url="https://cdn.cnn.com/cnnnext/dam/assets/190222152628-02-donald-trump-liu-he-large-11.jpg" height="300" width="300" />
<media:content medium="image" url="https://cdn.cnn.com/cnnnext/dam/assets/190222152628-02-donald-trump-liu-he-vertical-large-gallery.jpg" height="552" width="414" />
<media:content medium="image" url="https://cdn.cnn.com/cnnnext/dam/assets/190222152628-02-donald-trump-liu-he-video-synd-2.jpg" height="480" width="640" />
<media:content medium="image" url="https://cdn.cnn.com/cnnnext/dam/assets/190222152628-02-donald-trump-liu-he-live-video.jpg" height="324" width="576" />
<media:content medium="image" url="https://cdn.cnn.com/cnnnext/dam/assets/190222152628-02-donald-trump-liu-he-t1-main.jpg" height="250" width="250" />
</media:group>
</item>
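A site like Marca.com, on the other hand, puts only a link and a tracking image in <description>, while the actual summary sits in <media:description>:
Code: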
<item>
<title><![CDATA[Hazard deal goes cold thanks to FIFA... And Real Madrid]]></title>
<description><![CDATA[ <a href="https://www.marca.com/en/football/real-madrid/2019/02/23/5c7079d7e2704ee9b38b45a2.html"> Leer </a><img src="http://secure-uk.imrworldwide.com/cgi-bin/m?cid=es-widgetueditorial&cg=rss-marca&ci=es-widgetueditorial&si=https://e00-marca.uecdn.es/rss/en/index.xml" alt=""/>]]></description>
<dc:creator><![CDATA[marca.com]]></dc:creator>
<link>https://www.marca.com/en/football/real-madrid/2019/02/23/5c7079d7e2704ee9b38b45a2.html</link>
<media:description type="html"><![CDATA[A potential transfer to take <strong>Eden Hazard to Real Madrid</strong> looks less likely than ever at present, with a number of factors completely changing the complex of the sit]]></media:description>
<media:title type="html"><![CDATA[REAL MADRID|Unsure after form of Vinicius and Rodrygo's arrival]]></media:title>
<media:content url="https://e00-marca.uecdn.es/assets/multimedia/imagenes/2019/02/23/15508796541256.jpg" medium="image" width="650" height="366" />
<media:thumbnail url="https://e00-marca.uecdn.es/assets/multimedia/imagenes/2019/02/23/15508796541256_150x0.jpg" width="150" height="84" />
<guid>https://www.marca.com/en/football/real-madrid/2019/02/23/5c7079d7e2704ee9b38b45a2.html</guid>
<pubDate>Sat, 23 Feb 2019 01:11:14 +0100</pubDate>
</item>
Do you have any idea how to choose the "right" tag to look into? How do YOU do it in your feed skin? Is it something like "if a <media:description> tag exists (or has attributes), then that is where you should look", or something like "just check all the related tags and choose the one with the most content"? How the heck do other RSS parsers pick the correct content in a case like this? Or do they have to perform multiple checks and "guess" as well?
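To make the second idea concrete, here is a rough Python sketch of the "check all the related tags and pick the one with the most content" approach. It is only one possible heuristic, not how any particular feed reader or skin actually does it. The names `CANDIDATES`, `visible_text`, `pick_description` and `SAMPLE_ITEM` are made up for illustration, the sample item is a shortened stand-in modelled on the Marca.com entry with placeholder example.com URLs, and the regex-based tag stripping is deliberately crude.

```python
import re
import xml.etree.ElementTree as ET

# Namespace URI normally bound to the "media:" prefix (Media RSS).
MEDIA_NS = "http://search.yahoo.com/mrss/"

# Candidate tags, in the same order as the list at the top of this post.
CANDIDATES = [
    "description",
    f"{{{MEDIA_NS}}}description",
    f"{{{MEDIA_NS}}}text",
]


def visible_text(markup):
    """Drop HTML tags and collapse whitespace (crude, regex-based)."""
    without_tags = re.sub(r"<[^>]+>", " ", markup or "")
    return " ".join(without_tags.split())


def pick_description(item):
    """Return the candidate with the most visible text, or '' if none."""
    best = ""
    for tag in CANDIDATES:
        # iter() also finds candidates nested deeper, e.g. in <media:group>.
        for element in item.iter(tag):
            text = visible_text(element.text)
            if len(text) > len(best):
                best = text
    return best


# Hypothetical, shortened item: <description> holds only a link and an image.
SAMPLE_ITEM = """<item xmlns:media="http://search.yahoo.com/mrss/">
  <description><![CDATA[<a href="https://example.com/story"> Leer </a><img src="https://example.com/pixel" alt=""/>]]></description>
  <media:description type="html"><![CDATA[A potential transfer to take <strong>Eden Hazard to Real Madrid</strong> looks less likely than ever]]></media:description>
</item>"""

print(pick_description(ET.fromstring(SAMPLE_ITEM)))
```

With this sample, <description> is reduced to the single word "Leer" once its markup is stripped, so the longer <media:description> text wins. For a CNN-style item that only has <description>, the same function simply returns that. Comparing by length is a guess, though: a feed that puts a short teaser in one tag and the full article in another would need an extra rule.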