Dev:YaCyAPIforPHP


Introduction

A German version of this page is also available.

Using PHP's built-in cURL functions is an easy way to steer YaCy's Java servlets from another language or platform and to retrieve the XML/JSON result data they offer. For a list of all available API servlets see Dev:API.
To avoid overhead, an API class for PHP may be included. Its usage is similar to the Perl module ismael, so existing Perl code can easily be reused in PHP-based applications. There is also a PHP framework at YaCyAPI4.

API reference

steering

getProperties

Get a collection of peer status information: activecount, activelinks, activewords, passivecount, passivelinks, passivewords, potentialcount, potentiallinks, potentialwords, allcount, alllinks, allwords, authHash, pass, user, url, yourname, yourtype, yourversion, yourutc, youruptime, yourlinks, yourwords, youracceptcrawl, youracceptindex, yoursentwords, yoursenturls, yourreceivedwords, yourreceivesurls, yourppm, yourseeds, yourconnects

setProperties

set initial peer configuration: url, credentials

ping

check if peer available

findFirstPeer

check list of peers for first available

pauseCrawling

pause local or remote crawl jobs

resumeCrawling

resume local or remote crawl jobs

stop

shutdown or restart peer

getYconf

get value from yacy.conf

search

search

search the peer for a query term

setProperties

set peer url and credentials and initialize the query with defaults: url=>url, user=>username, pass=>password

setFormat

set format for result data

setStartRecord

setMaximumRecords

setVerify

setContentdom

setSources

setPrefermaskfilter

setUrlmaskfilter

setResource

setLr

crawling

addJob

add a new crawl job url=>url, depth=>number, intention=>text, filter=>regex, dynamic=>on/off, local=>on/off, remote=>on/off, maxcheck=>on/off, maxpages=>number, autodom=>on/off, autodomdepth=>number, recrawl=>on/off, recrawltime=>number, recrawlunit=>year/month/day/hour/minute, stopword=>on/off
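
The addJob parameters listed above can be collected in an associative array before the job is submitted. The following is only a sketch: the helper name buildCrawlJob and its default values are made up for illustration, and the idea that addJob accepts such an array is an assumption that this page does not confirm.

```php
<?php
// Hypothetical helper: merges caller options over illustrative defaults.
// The keys are the addJob parameters listed above; the default values are
// assumptions chosen for this example, not YaCy's own defaults.
function buildCrawlJob(string $url, array $options = []): array
{
    $defaults = [
        'url'      => $url,
        'depth'    => 1,
        'filter'   => '.*',
        'local'    => 'on',
        'remote'   => 'off',
        'recrawl'  => 'off',
        'stopword' => 'on',
    ];
    // Only keep keys that appear in the documented parameter list.
    $allowed = ['url', 'depth', 'intention', 'filter', 'dynamic', 'local',
                'remote', 'maxcheck', 'maxpages', 'autodom', 'autodomdepth',
                'recrawl', 'recrawltime', 'recrawlunit', 'stopword'];
    $options = array_intersect_key($options, array_flip($allowed));
    return array_merge($defaults, $options);
}

$job = buildCrawlJob('http://example.org/', ['depth' => 2, 'intention' => 'test crawl']);
// Assumption: addJob() takes the parameters as one array.
// $search->addJob($job);
```

Filtering against the documented key list keeps misspelled option names from being passed on to the peer unnoticed.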

Usage

Require YaCyAPI2.php and create a new instance of the YaCyAPI class.

<?php
require 'YaCyAPI2.php';
$search = new YaCyAPI();

Set peer URL and user credentials for restricted steering/status.

$search->setProperties('mypeer.tld:myport','admin','password');

Select format of resulting data.

$search->setFormat('xml');

Set default parameters for the search.

$search->setStartRecord('1')
       ->setMaximumRecords('15')
       ->setVerify('true');

Start a search.

$results = $search->search('test');
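
What search() returns is not described on this page. Assuming it hands back the raw XML as a string, and assuming that XML is an RSS-style document with channel/item elements, the hits could be extracted roughly as follows. The function name extractSearchHits is hypothetical.

```php
<?php
// Sketch under assumptions: $rssXml is an RSS 2.0 style document
// (<rss><channel><item><title/><link/></item></channel></rss>).
function extractSearchHits(string $rssXml): array
{
    libxml_use_internal_errors(true);          // do not emit warnings on bad XML
    $doc = simplexml_load_string($rssXml);
    if ($doc === false || !isset($doc->channel)) {
        return [];
    }
    $hits = [];
    foreach ($doc->channel->item as $item) {
        $hits[] = [
            'title' => (string) $item->title,
            'link'  => (string) $item->link,
        ];
    }
    return $hits;
}

// Usage with the object from above (assumes a string result):
// foreach (extractSearchHits($results) as $hit) {
//     echo htmlspecialchars($hit['title']) . "\n";
// }
```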

Howto

This article uses PHP5 and a web server such as Apache to explain how to steer and query a YaCy peer without using the built-in admin interface.

Use PHP and Apache with Linux

Use PHP and Apache with Windows

The EasyPHP package offers a powerful Apache/PHP/MySQL development kit for Windows that is easy to install and well suited to small web applications like YaCymin. After download and installation, the PHP extension cURL must be enabled in the configuration menu and the contents of the YaCymin package copied to ./www. The application is then available at http://localhost:80

Use PHP and Apache with MacOS

Call YaCy servlet using native PHP and cURL

This sample pauses local crawls, just as if you had clicked the corresponding button in the admin console.

   <?php
      $YaCyURL="http://myadress.tld:myport/";    # peer address with scheme and trailing slash
      $appID="admin:password";
      $command="Status.html?pauseCrawlJob=&jobType=localCrawl";    #pause local crawl

      $cu=$YaCyURL.$command;    
      $queryServer = curl_init($cu);
      curl_setopt($queryServer, CURLOPT_HEADER, 0);
      curl_setopt($queryServer, CURLOPT_RETURNTRANSFER, 1);
      curl_setopt($queryServer, CURLOPT_USERPWD,$appID);

      //holder for delivered content
      $holder = curl_exec($queryServer);
    
      # done
      curl_close($queryServer);
   ?>
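
The same call can be wrapped in two small functions so that the URL is assembled in one place and transfer errors are not silently ignored. This is a sketch, not part of the YaCy API class: buildServletUrl and callServlet are hypothetical names, and localhost:8090 is only a placeholder address.

```php
<?php
// Builds "http://host:port/Servlet?key=value&..." from its parts.
function buildServletUrl(string $hostPort, string $servlet, array $params = []): string
{
    $url = 'http://' . rtrim($hostPort, '/') . '/' . ltrim($servlet, '/');
    if ($params) {
        $url .= '?' . http_build_query($params);
    }
    return $url;
}

// Calls the servlet with HTTP authentication; returns the body or null on failure.
function callServlet(string $url, string $userPwd): ?string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_USERPWD, $userPwd);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);       // illustrative timeout in seconds
    $body   = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    return ($body !== false && $status === 200) ? $body : null;
}

$url = buildServletUrl('localhost:8090', 'Status.html',
                       ['pauseCrawlJob' => '', 'jobType' => 'localCrawl']);
// $body = callServlet($url, 'admin:password');
```

Passing the parameters as an array lets http_build_query take care of URL encoding, which matters as soon as a value contains spaces or special characters.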

Parse XML result

This sample calls Network.xml and parses the returned XML, which contains information about the peers in a YaCy network.

   <?php
      $YaCyURL="http://myadress.tld:myport/";    # peer address with scheme and trailing slash
      $appID="admin:password";
      $command="Network.xml?page=1";    #get peerlist

      $cu=$YaCyURL.$command;    
      $queryServer = curl_init($cu);
      curl_setopt($queryServer, CURLOPT_HEADER, 0);
      curl_setopt($queryServer, CURLOPT_RETURNTRANSFER, 1);
      curl_setopt($queryServer, CURLOPT_USERPWD,$appID);

      $xml = curl_exec($queryServer);
    
      # done
      curl_close($queryServer);

      # xml2array() is not a PHP built-in; it has to be provided by a helper include
      $resultarray=xml2array($xml);

      $items=$resultarray['peers']['peer'];
      if ($items)
      {
       echo "<table>";
       foreach ($items as $item)
       {
        echo "<tr>";
        echo "<td>".$item['hash']."</td>";
        echo "<td>".$item['fullname']."</td>";
        echo "<td>".$item['version']."</td>";
        echo "<td>".$item['ppm']."</td>";
        echo "<td>".$item['qph']."</td>";
        echo "<td>".$item['uptime']."</td>";
        echo "<td>".$item['links']."</td>";
        echo "<td>".$item['words']."</td>";
        echo "<td>".$item['rurls']."</td>";
        echo "<td>".$item['lastseen']."</td>";
        echo "<td>".$item['sendWords']."</td>";
        echo "<td>".$item['receivedWords']."</td>";
        echo "<td>".$item['sendURLs']."</td>";
        echo "<td>".$item['receivedURLs']."</td>";
        echo "<td>".$item['type']."</td>";
        echo "<td>".$item['direct']."</td>";
        echo "<td>".$item['acceptcrawl']."</td>";
        echo "<td>".$item['dhtreceive']."</td>";
        echo "<td>".$item['rankingreceive']."</td>";
        echo "<td>".$item['location']."</td>";
        echo "<td>".$item['seedurl']."</td>";
        echo "<td>".$item['age']."</td>";
        echo "<td>".$item['seeds']."</td>";
        echo "<td>".$item['connects']."</td>";
        echo "<td>".$item['address']."</td>";
        echo "<td>".$item['useragent']."</td>";
        echo "</tr>"; 
       }
       echo "</table>";
      }
   ?>
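
Because xml2array() is not part of PHP itself, the peer list can also be read with the bundled SimpleXML extension. The sketch below assumes that the document root is peers, that each peer is a peer element, and that the values are child elements, as the array access in the sample above suggests. parsePeerList is a hypothetical name.

```php
<?php
// Converts <peers><peer><hash>..</hash>...</peer>...</peers> into a list of
// associative arrays, one per peer.
function parsePeerList(string $xml): array
{
    libxml_use_internal_errors(true);          // do not emit warnings on bad XML
    $doc = simplexml_load_string($xml);
    if ($doc === false) {
        return [];
    }
    $peers = [];
    foreach ($doc->peer as $peer) {
        $row = [];
        foreach ($peer->children() as $field) {
            $row[$field->getName()] = (string) $field;
        }
        $peers[] = $row;
    }
    return $peers;
}

// Usage: foreach (parsePeerList($xml) as $item) { echo $item['fullname']; }
```

The function always returns a list, including when the network contains only a single peer, so the foreach loop does not need a special case.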

Parse JSON

   <?php
      require 'JSON.php';               # PEAR Services_JSON
      # $body holds the JSON text returned by curl_exec()
      $json = new Services_JSON();
      $json = $json->decode($body);
               
      $ll=$json->loglines;
      if ($ll)
      {
        foreach ($ll as $logline)
        {
          echo $logline->logline."\n";
          echo $logline->loglevel." ";  
        }
      }
  ?>
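
On PHP versions that ship the json extension, the built-in json_decode() can replace Services_JSON. The sketch below assumes the same response shape as the sample above, a loglines list whose entries carry logline and loglevel. readLogLines is a hypothetical name.

```php
<?php
// Returns one "level text" string per log entry, or an empty list if the
// body is not valid JSON or has no loglines list.
function readLogLines(string $body): array
{
    $data = json_decode($body, true);          // true: decode objects as arrays
    if (!is_array($data) || !isset($data['loglines']) || !is_array($data['loglines'])) {
        return [];
    }
    $lines = [];
    foreach ($data['loglines'] as $entry) {
        $level = isset($entry['loglevel']) ? $entry['loglevel'] : '';
        $text  = isset($entry['logline'])  ? $entry['logline']  : '';
        $lines[] = trim($level . ' ' . $text);
    }
    return $lines;
}

// Usage: echo implode("\n", readLogLines($body));
```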

Use the YaCy API

To shorten repetitive tasks, the PHP class offers the most basic steering functions and makes it easy to start crawls or run extended searches.
To use this class, the PHP file must be included and at least a peer address/port has to be defined.

<?php
 // Include the API PHP Library
 require 'YaCyAPI2.php';
 // start the class 
 $search = new YaCyAPI();

Now several optional parameters may be set using setProperties, or status information may be queried with getProperties.

Sample Application: YaCymin

This sample is using just a few basic YaCy applets for steering and querying a YaCy peer.

  • Dev:APIStatus is used for basic steering like crawl pause/resume and restart/shutdown.

Application index

Each YaCymin applet is called from this main index file, selected by the action parameter.

<?php
 // Include the API PHP Library
 require 'YaCyAPI2.php';
 include 'JSON.php';

 // start the class 
 $search = new YaCyAPI();

 //html headers
 require 'header_inc.php'; 
 
 $action = isset($_GET['action']) ? $_GET['action'] : '';   // avoid a notice when no action is given
 switch ($action)
 {
  case "peers":
  default:
  require 'inc_peers.php';
  break;
 
  case "peerlist":
  require 'inc_peerlist.php';
  break;
  
  case "search":
  require 'inc_search.php';
  break;

} #endswitch

require 'footer_inc.php';
echo "</body></html>";

First, all CSS, JavaScript and PHP includes are loaded and the main menu is shown using header_inc.php.
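
The action dispatch in the index file can also be expressed as a lookup table, which keeps the list of applets in one place and never builds a file name from user input. This is a sketch only: resolveApplet is a hypothetical helper, and the file names are the ones used in the index file above.

```php
<?php
// Maps the action parameter to an include file; unknown or missing
// actions fall back to the peers overview, like the default case above.
function resolveApplet($action): string
{
    $applets = [
        'peers'    => 'inc_peers.php',
        'peerlist' => 'inc_peerlist.php',
        'search'   => 'inc_search.php',
    ];
    if (is_string($action) && isset($applets[$action])) {
        return $applets[$action];
    }
    return $applets['peers'];
}

// Usage in the index file:
// require resolveApplet(isset($_GET['action']) ? $_GET['action'] : null);
```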

Getting peer status

This applet is called for a quick status overview showing all peers defined in peerlist_inc.php.

 // show overview
 echo "<h3><font color=grey> ".date('d-m H:i',time())."</font></h3>";
  
 include 'peerlist_inc.php';
 
 $maxpeers=count($this_YaCyPeer)+1;
 
 for ($peerno=0;$peerno<$maxpeers;$peerno++)
 {
 
  $peer=$this_YaCyPeer[$peerno][0];
  $port=$this_YaCyPeer[$peerno][1];
  $appid=$this_YaCyPeer[$peerno][2];
  $name=$this_YaCyPeer[$peerno][3];

  #$info=$search->peerStatus($peerno);                #get peerinfo from  internal list  
  $res=$search->setProperties($peer.":".$port,$appid,$name);
  $info=$search->ping();
  
  if ($info['host'])     #peer defined?
  {
   $t="Showing peer-name / address and time for a ping / time for executing commands / peer uptime. Click for more infos.";
   echo '<div class="blogtoy"><h2 title=" '.$t.' "class="widgetheader">';
   
   if ($info['status'])                                             #peer online
    {
      #$items=$search->peerInfo($peerno);   #extended info     
      $items=$search->getProperties();
      $info['name']=$items['name'];
      
      echo "<table border=0 width=100%><tr><td width=50%>";     
      if ($info['name'])
      {
        echo "<i>".$info['name']."</i> is ";
      } else {
        echo $info['host'].":".$info['port']." is ";     
      } 
          
      echo "<font color=green>on</font>";
      echo "</td><td>";
      $ti1=round(($search->_dur1*1000),0);
      $ti2=round(($search->_dur2*1000),0);
      switch (true)     # each case is a condition; the first one that holds wins
      {
      case ($ti1 > 300):
        $ti1="<font color=red>".$ti1."</font>";
      break;      
      case ($ti1 > 100):
        $ti1="<font color=orange>".$ti1."</font>";
      break;      
      }    
      switch (true)     # each case is a condition; the first one that holds wins
      {
      case ($ti2 > 5000):
        $ti2="<font color=red>".$ti2."</font>";
      break;      
      case ($ti2 > 500):
        $ti2="<font color=orange>".$ti2."</font>";
      break;      
      }
      
      echo "<font size=1> (<font color=grey>ping:</font>".$ti1." <font color=grey>cmd:</font>".$ti2." ms)</font></td>"; 
      echo "<td align = right>";      
      echo " ".$items['uptime']."   ";     
      echo "</td></tr></table>";
     } 
     else 
     { 
        if ($info['name'])
        {
          echo "<i>".$info['name']."</i> is ";
        } else {
           echo $info['host'].":".$info['port']." is ";     
        } 

        echo "<font color=red>off";
        echo "</font>"; 
        echo ' <a href=?action=SSH&host='.$info['host'].' title="SSH"><img src=images/cats.png></a>';
     } 
     
     echo '</h2> <div class="innerwidget">';
     if ($info['status']) {
       echo "<div align=center><h4>Peer Details:</h4>";       
       echo $items['type']." ".$items['name']." (".$info['host'].":".$info['port'].")";
       echo " subversion:" .$items['version'];
       echo "<br>links/words: ".$items['links']."/".$items['words']." - ";
       if ($items['acceptindex']) echo "accepts DHT-in ";
       if ($items['acceptcrawl']) echo "is remote crawling";
 
      echo "<br><a href=http://".$info['host'].":".$info['port']."/Status.html>Admin</a>";
      echo " - <a href=?action=log&peer=".$peerno.">Log</a>";
      echo " - <a href=?action=SSH&host=".$info['host'].">SSH</a>";
 
      if ($ym_mode=="admin"){
       echo " - <a href=http://".$info['host'].":".$info['port']."/Steering.html?restart=><font color=orange>Restart</font></a>"; 
       echo " - <a href=http://".$info['host'].":".$info['port']."/Steering.html?shutdown=1><font color=red>Shutdown</font></a>"; 
      }  
 
      echo "<br><img src=http://".$info['host'].":".$info['port']."/Banner.png?textcolor=000000&bgcolor=ddeeee&bordercolor=aaaaaa>";   
      echo "</div>"; 
 
      echo '<div class="blogtoy">';
	    echo '<h2 class="widgetheader"><font>Performance...</font></h2>';
	    echo '<div class="innerwidget">';
 
      echo '<iframe src="http://'.$info['host'].":".$info['port'].'/rssTerminal.html?set=PEERNEWS,REMOTESEARCH,LOCALSEARCH,REMOTEINDEXING,LOCALINDEXING,INDEXRECEIVE&width=600px&height=180px&maxlines=20&maxwidth=120"
              style="width:600px;height:180px;margin:0px;border:1px solid black;" scrolling="no" name="newsframe"></iframe><br />';
              
      echo "<p><img src=http://".$info['host'].":".$info['port']."/PerformanceGraph.png></p>";  
      echo "<hr>";
      echo '</div></div>';
   }
   else
   {
      echo "Offline";
   }
   echo "</div></div>";   #blogtoy
   echo "</div>";

 } # end if peer defined  

} #end for peerno
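
The threshold coloring of the ping and command times can also be written as a small function, so the same rule serves both values. This is a sketch: colorizeLatency is a hypothetical name, and the thresholds are passed in by the caller, using the values from the listing above.

```php
<?php
// Wraps a duration in milliseconds in a colored <font> tag:
// red above $critAbove, orange above $warnAbove, plain text otherwise.
function colorizeLatency(int $ms, int $warnAbove, int $critAbove): string
{
    if ($ms > $critAbove) {
        return '<font color=red>' . $ms . '</font>';
    }
    if ($ms > $warnAbove) {
        return '<font color=orange>' . $ms . '</font>';
    }
    return (string) $ms;
}

// Usage with the thresholds from the listing:
// $ti1 = colorizeLatency((int) round($search->_dur1 * 1000), 100, 300);
// $ti2 = colorizeLatency((int) round($search->_dur2 * 1000), 500, 5000);
```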

Getting Network status


echo "<h3><font color=grey> Active peers in network </font></h3>";

include 'peerlist_inc.php';

$command="Network.xml?page=1";
$peerno="0";
$peer=$this_YaCyPeer[$peerno][0];
$port=$this_YaCyPeer[$peerno][1];
$appid=$this_YaCyPeer[$peerno][2];
$name=$this_YaCyPeer[$peerno][3];
$peername="http://".$peer.":".$port."/";

$res=$search->setProperties($peer.":".$port,$appid,$name);
$info=$search->ping();
  
if ($info['host'])     #peer defined?
{

$results = $search->peerCommandDirect($peername,$command); 

//now we have xml, put it in a simple array
$resultarray=xml2array($results);  #, $get_attributes = 1, $priority = 'tag');
  
 //get items only
$items=$resultarray['peers']['peer'];
if ($items)
{
  echo "<table>";
  foreach ($items as $item)
  {
   if (isset($tr) && $tr=="ffffff") {$tr="aaaaaa";} else {$tr="ffffff";}   # alternate row colors
   
   echo "<tr bgcolor=#".$tr.">";
   #echo "<td>".$item['hash']."</td>";
   echo "<td>".$item['fullname']."</td>";
   echo "<td>".$item['type']."</td>";
   echo "<td>".$item['version']."</td>";
   echo "<td>".$item['ppm']."</td>";
   echo "<td>".$item['qph']."</td>";
   echo "<td>".$item['uptime']."</td>";
   echo "<td>".$item['links']."</td>";
   echo "<td>".$item['words']."</td>";
   #echo "<td>".$item['rurls']."</td>";
   #echo "<td>".$item['lastseen']."</td>";
   #echo "<td>".$item['sendWords']."</td>";
   #echo "<td>".$item['receivedWords']."</td>";
   #echo "<td>".$item['sendURLs']."</td>";
   #echo "<td>".$item['receivedURLs']."</td>";
   
   echo "<td>".$item['direct']."</td>";
   echo "<td>".$item['acceptcrawl']."</td>";
   echo "<td>".$item['dhtreceive']."</td>";
   echo "<td>".$item['rankingreceive']."</td>";
   # echo "<td>".$item['location']."</td>";
   # echo "<td>".$item['seedurl']."</td>";
   echo "<td>".$item['age']."</td>";
   #echo "<td>".$item['seeds']."</td>";
   #echo "<td>".$item['connects']."</td>";
   echo "</tr>"; 
  }
  echo "</table>";
  
} #end if items

} #end if peer online
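
The table rows can be produced by one function that takes the list of columns to show. Peer names and other fields come from remote peers, so this sketch escapes them with htmlspecialchars() before output. renderPeerRows is a hypothetical name, and the alternating colors match the listing above.

```php
<?php
// Renders one <tr> per peer with alternating background colors.
// $columns lists the keys of each peer array that should become cells.
function renderPeerRows(array $items, array $columns): string
{
    $html = '';
    foreach (array_values($items) as $i => $item) {
        $bg = ($i % 2 === 0) ? 'ffffff' : 'aaaaaa';
        $html .= '<tr bgcolor=#' . $bg . '>';
        foreach ($columns as $col) {
            $value = isset($item[$col]) ? (string) $item[$col] : '';
            $html .= '<td>' . htmlspecialchars($value) . '</td>';
        }
        $html .= "</tr>\n";
    }
    return $html;
}

// Usage:
// echo "<table>" . renderPeerRows($items, ['fullname', 'type', 'version']) . "</table>";
```

Choosing the columns through a parameter replaces the commented-out echo lines with a single list that is easy to adjust.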

Getting crawl profiles

Start new crawls