PHP => RSS

hub
Guest without a PHPfrance account

30 June 2006, 18:21

Hello,
I found a fairly good script on the internet that turns a PHP page into an RSS feed:
<?php
// This program is public domain. Do with this what you want.
// It is derived from Gnews2rss (http://www.voidstar.com/gnews2rss.php)
// by Barijaona Ramaholimihaso
//
// Disclaimer. Don't expect this to be here, to work, or to get fixed.
// But if you have a question or comment, email [email protected]
//


// ----------------------------------------------------------------------------
// personalize these settings according to your needs

// URL to parse
  $url = "http://www.lemonde.fr/txt/sequence/0,2-3208,1-0,0.html";

// do not scrape before this text
  $ignore_before = "<a name=vers_tete>";

// encoding
  $encoding = "iso-8859-1";

// description
  $channel_description = "La Une du Monde";

// language
  $channel_language = "fr-fr";

// regular expression pattern, and positions 
// of interesting stuff delimited by parenthesis in the pattern
  $itemregexp = "%<a href=\'(.+?)\'[^>]*>(.+?)</a>([^<]*)%i";

  $url_match_number = 1;
  $title_match_number = 2;
  $desc_match_number = 3;

// This is used to suppress some tags and makes writing the search pattern easier
  $searchable_tags = "<A><B><BR><BLOCKQUOTE><CENTER><DD><DL><DT><HR><I><IMG><LI><OL><P><PRE><U><UL>"; 

//------------------------------------------------------------------------------

  $allowable_tags = "<A><B><BR><BLOCKQUOTE><CENTER><DD><DL><DT><HR><I><IMG><LI><OL><P><PRE><U><UL>";

  header("Cache-Control: public");

// When debugging, make the following line a comment
  header("Content-Type: text/xml");

  preg_match("/(http:\/\/([^\/]*))/i", $url, $matches);
  $root = $matches[1]."/";

// start from an empty string so that .= does not append to an undefined variable
  $data = "";
  if ($fp = @fopen($url, "r")) {
    while (!feof($fp)) $data .= fgets($fp, 128);
    fclose($fp);
  }

// *******************

// Debug stuff : comment out the content-type header above
// and uncomment the following lines to see what the site is returning.
//  print "<html>";
//  print "<pre>";
//  print htmlentities($data);

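// eregi() was removed in PHP 7; preg_match() with the "i" modifier replaces it
// ("s" lets "." match newlines, as the POSIX version did)
  preg_match("%<title>(.*)</title>%is", $data, $title);

  $channel_title = isset($title[1]) ? $title[1] : "";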

// trash the text before the $ignore_before text
  $data = strstr($data,$ignore_before);

// Debug stuff;
//  print htmlentities($data);

// suppress some tags and makes writing the search pattern easier
  $data = strip_tags($data, $searchable_tags);

// Debug stuff;
//  print htmlentities($data);

  $match_count = preg_match_all($itemregexp, $data, $items);
  $match_count = ($match_count > 70) ? 70 : $match_count;

  $output = "<?xml version=\"1.0\" encoding=\"$encoding\" ?>\n";
  $output .= "<!DOCTYPE rss >\n";

  $output .= "<rss version=\"2.0\">\n";
  $output .= "  <channel>\n";
  $output .= "    <title>$channel_title</title>\n";
  $output .= "    <link>". htmlspecialchars($url, ENT_QUOTES, $encoding) ."</link>\n";
  $output .= "    <description>$channel_description</description>\n";
  $output .= "    <webMaster>[email protected]</webMaster>\n";
  $output .= "    <language>$channel_language</language>\n";
  $output .= "    <generator><a href=\"http://www.voidstar.com/gnews2rss.php\">GNews2Rss</a></generator>\n";

  for ($i=0; $i< $match_count; $i++) {

    $item_url = $items[$url_match_number][$i];
    $item_url = htmlspecialchars($item_url, ENT_QUOTES, $encoding);
    $item_url = preg_replace("%^/%", $root, $item_url);

    $title = $items[$title_match_number][$i];
    $title = strip_tags($title);
    $title = htmlspecialchars(html_entity_decode($title, ENT_QUOTES, $encoding), ENT_QUOTES, $encoding);
// htmlspecialchars() escaped any leftover numeric entities to &amp;#NNN; so restore them
    $title = preg_replace("/&amp;#([0-9]+);/", "&#\${1};", $title);

    $desc = $items[$desc_match_number][$i];
    $desc = strip_tags($desc, $allowable_tags);
    $desc = htmlspecialchars(html_entity_decode($desc, ENT_QUOTES, $encoding), ENT_QUOTES, $encoding);
// htmlspecialchars() escaped any leftover numeric entities to &amp;#NNN; so restore them
    $desc = preg_replace("/&amp;#([0-9]+);/", "&#\${1};", $desc);

    $output .= "    <item>\n";
    $output .= "      <title>". $title ."</title>\n";
    $output .= "      <link>". $item_url ."</link>\n";
    $output .= "      <description>". $desc ."</description>\n";
    $output .= "    </item>\n";
  }

  $output .= "  </channel>\n";
  $output .= "</rss>\n";

  print $output;

?>
I have set the address of the page to convert correctly, but what I can't work out is how to delimit the text to encode (I can't manage to delimit my news items with regular expressions, despite the many tutorials).
All the "articles" on the PHP page I want to turn into RSS start with "//debut" and end with "//fin", like this:
//debut
My article
//fin
//debut
My article
//fin

Can you help me? Thanks in advance for your reply.
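
For reference, here is a minimal sketch of one pattern that could pick out the text between "//debut" and "//fin". It is not taken from the original script: the sample content in $data is made up for illustration, and the pattern assumes the markers appear literally in the fetched page and survive the strip_tags() step.

```php
<?php
// Hypothetical sample content; in the script above this would be the fetched $data.
$data = "//debut\nMy first article\n//fin\n//debut\nMy second article\n//fin\n";

// "%" is used as the delimiter so the slashes in //debut and //fin need no escaping.
// (.+?) is non-greedy, so each match stops at the nearest //fin.
// The "s" modifier lets "." match newlines, for articles spanning several lines.
$itemregexp = "%//debut\s*(.+?)\s*//fin%s";

$match_count = preg_match_all($itemregexp, $data, $items);

// $items[1] holds the text captured between each pair of markers.
foreach ($items[1] as $article) {
    print $article . "\n";
}
```

This pattern has a single capturing group, so in the script $title_match_number and $desc_match_number would presumably both be set to 1. It captures no per-item URL, so $url_match_number has nothing to point at unless the page also provides a link inside each article.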