Dev:YaCyAPIforPHP
Introduction
A German version of this page is also available.
Using PHP's built-in cURL functions is an easy way to control YaCy's Java servlets from PHP and to retrieve the XML/JSON result data they offer. For a list of all available API servlets see Dev:API.
To avoid overhead, an API class for PHP may be included. Its usage is similar to the Perl module ismael, so existing Perl code can easily be reused in PHP-based applications. There is also a PHP framework at YaCyAPI4.
API reference
steering
getProperties | get a collection of peer status information | activecount, activelinks, activewords, passivecount, passivelinks, passivewords, potentialcount, potentiallinks, potentialwords, allcount, alllinks, allwords, authHash, pass, user, url, yourname, yourtype, yourversion, yourutc, youruptime, yourlinks, yourwords, youracceptcrawl, youracceptindex, yoursentwords, yoursenturls, yourreceivedwords, yourreceivesurls, yourppm, yourseeds, yourconnects
setProperties | set initial peer configuration | url, credentials
ping | check whether the peer is available |
findFirstPeer | check a list of peers for the first available one |
pauseCrawling | pause local or remote crawl jobs |
resumeCrawling | resume local or remote crawl jobs |
stop | shut down or restart the peer |
getYconf | get a value from yacy.conf |
search
search | search the peer | query term
setProperties | set peer URL and credentials and initialize the query with defaults | url=>url, user=>username, pass=>password
setFormat | set the format of the result data |
setStartRecord | |
setMaximumRecords | |
setVerify | |
setContentdom, setSources | |
setPrefermaskfilter | |
setUrlmaskfilter | |
setResource | |
setLr | |
crawling
addJob | add a new crawl job | url=>url, depth=>number, intention=>text, filter=>regex, dynamic=>on/off, local=>on/off, remote=>on/off, maxcheck=>on/off, maxpages=>number, autodom=>on/off, autodomdepth=>number, recrawl=>on/off, recrawltime=>number, recrawlunit=>year/month/day/hour/minute, stopword=>on/off
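Because addJob takes many named options, a misspelled key is easy to miss. The following is a minimal sketch, not part of YaCyAPI, that checks an option array against the key names listed for addJob above. The helper name `unknownJobKeys`, the constant `ADDJOB_KEYS` and the sample values are assumptions made for illustration; how the array is finally handed to addJob is not shown on this page.

```php
<?php
// Key names copied from the addJob row of the reference above.
const ADDJOB_KEYS = array(
    'url', 'depth', 'intention', 'filter', 'dynamic', 'local', 'remote',
    'maxcheck', 'maxpages', 'autodom', 'autodomdepth', 'recrawl',
    'recrawltime', 'recrawlunit', 'stopword',
);

// Hypothetical helper: return every key of $job that the reference does not list.
function unknownJobKeys(array $job)
{
    return array_values(array_diff(array_keys($job), ADDJOB_KEYS));
}

// Sample options; all values are placeholders.
$job = array(
    'url'         => 'http://example.org/',
    'depth'       => 2,
    'intention'   => 'test crawl',
    'recrawl'     => 'on',
    'recrawltime' => 7,
    'recrawlunit' => 'day',
);

$unknown = unknownJobKeys($job);
if ($unknown) {
    echo "Unknown addJob options: " . implode(', ', $unknown) . "\n";
}
```

Running such a check before starting a crawl catches typos like 'deph' early instead of silently ignoring them.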
Usage
Require YaCyAPI2.php and create a new instance of the YaCyAPI class.
<?php
require 'YaCyAPI2.php';
$search = new YaCyAPI();
Set peer URL and user credentials for restricted steering/status.
$search->setProperties('mypeer.tld:myport','admin','password');
Select format of resulting data.
$search->setFormat('xml');
Set defaults for the search.
$search->setStartRecord('1') ->setMaximumRecords('15') ->setVerify('true');
Start a search.
$results = $search->search('test');
Howto
This article uses PHP 5 and a web server such as Apache to explain how to control and query a YaCy peer without using the built-in admin interface.
Use PHP and Apache with Linux
Use PHP and Apache with Windows
The EasyPHP package offers a powerful Apache/PHP/MySQL development kit for Windows that is easy to install and well suited to small web applications like YaCymin. After download and installation, the PHP extension cURL must be enabled in the configuration menu and the contents of the YaCymin package copied to ./www. The application is then available at http://localhost:80
Use PHP and Apache with MacOS
Call YaCy servlet using native PHP and cURL
This sample pauses local crawls, as if you had clicked the corresponding button in the admin console.
<?php
// Base URL of the peer; the scheme and the trailing slash are needed
// so that appending the servlet name yields a valid URL
$YaCyURL = "http://myadress.tld:myport/";
$appID = "admin:password";
$command = "Status.html?pauseCrawlJob=&jobType=localCrawl"; // pause local crawl
$cu = $YaCyURL.$command;
$queryServer = curl_init($cu);
curl_setopt($queryServer, CURLOPT_HEADER, 0);
curl_setopt($queryServer, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($queryServer, CURLOPT_USERPWD, $appID);
// holder for delivered content
$holder = curl_exec($queryServer);
// done
curl_close($queryServer);
?>
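The sample above repeats the same cURL setup for every servlet call. As a sketch under stated assumptions, the setup could be wrapped in two small functions. The names `yacyUrl` and `yacyRequest`, the five-second connect timeout and the host `peer.example:8090` are illustrative choices, not part of YaCy or of the API class.

```php
<?php
// Hypothetical helper: join base URL and servlet path with exactly one slash.
function yacyUrl($base, $path)
{
    return rtrim($base, '/') . '/' . ltrim($path, '/');
}

// Hypothetical helper: call a servlet and return the response body,
// or false if the request failed.
function yacyRequest($base, $path, $credentials)
{
    $ch = curl_init(yacyUrl($base, $path));
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_USERPWD, $credentials);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5); // assumed timeout in seconds
    $body = curl_exec($ch);
    if ($body === false) {
        // curl_exec() returns false on failure; log the reason
        error_log('YaCy request failed: ' . curl_error($ch));
    }
    curl_close($ch);
    return $body;
}

// Only the URL is built here; no request is sent.
echo yacyUrl('http://peer.example:8090/', '/Status.html') . "\n";
```

A call such as `yacyRequest($YaCyURL, $command, $appID)` would then replace the inline cURL lines, and the caller can test the result against false before parsing it.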
Parse XML result
This sample calls Network.xml and parses the delivered XML, which contains information about the peers in a YaCy network.
<?php
// Base URL of the peer, including scheme and trailing slash
$YaCyURL = "http://myadress.tld:myport/";
$appID = "admin:password";
$command = "Network.xml?page=1"; // get peerlist
$cu = $YaCyURL.$command;
$queryServer = curl_init($cu);
curl_setopt($queryServer, CURLOPT_HEADER, 0);
curl_setopt($queryServer, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($queryServer, CURLOPT_USERPWD, $appID);
$xml = curl_exec($queryServer);
// done
curl_close($queryServer);
// xml2array() is not a PHP built-in; it has to be provided by a helper library
$resultarray = xml2array($xml);
$items = $resultarray['peers']['peer'];
// columns to print, in display order
$fields = array('hash', 'fullname', 'version', 'ppm', 'qph', 'uptime',
    'links', 'words', 'rurls', 'lastseen', 'sendWords', 'receivedWords',
    'sendURLs', 'receivedURLs', 'type', 'direct', 'acceptcrawl',
    'dhtreceive', 'rankingreceive', 'location', 'seedurl', 'age', 'seeds',
    'connects', 'address', 'useragent');
if ($items) {
    echo "<table>";
    foreach ($items as $item) {
        echo "<tr>";
        foreach ($fields as $field) {
            echo "<td>".$item[$field]."</td>";
        }
        echo "</tr>";
    }
    echo "</table>";
}
?>
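Since xml2array() is an external helper, the same extraction can be sketched with PHP's built-in SimpleXML extension. The XML string below is a made-up sample: it assumes that each peer is a `<peer>` element below a `<peers>` root and that the fields are child elements, which is inferred from the array keys used above and may differ from the real Network.xml. The function name `peerRows` is hypothetical.

```php
<?php
// Made-up sample document; the real Network.xml layout may differ.
$xml = '<peers>'
     . '<peer><hash>abc</hash><fullname>alpha</fullname><type>senior</type></peer>'
     . '<peer><hash>def</hash><fullname>beta</fullname><type>junior</type></peer>'
     . '</peers>';

// Hypothetical helper: return one associative array per <peer> element,
// restricted to the requested fields. Missing fields become empty strings.
function peerRows($xml, array $fields)
{
    $doc = simplexml_load_string($xml);
    if ($doc === false) {
        return array(); // not well-formed XML
    }
    $rows = array();
    foreach ($doc->peer as $peer) {
        $row = array();
        foreach ($fields as $field) {
            $row[$field] = (string) $peer->{$field};
        }
        $rows[] = $row;
    }
    return $rows;
}

$rows = peerRows($xml, array('hash', 'fullname', 'type'));
foreach ($rows as $row) {
    echo implode(' | ', $row) . "\n";
}
```

Unlike a generic array conversion, iterating over `$doc->peer` also works when the document contains only a single peer.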
Parse JSON
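<?php
// Services_JSON is provided by JSON.php
require 'JSON.php';
// $body is expected to hold the JSON text returned by the peer,
// for example the result of curl_exec() as in the samples above
$json = new Services_JSON();
$json = $json->decode($body);
$ll = $json->loglines;
// only iterate when log lines were delivered
// (a bare "break" outside of a loop is an error)
if ($ll) {
    foreach ($ll as $logline) {
        echo $logline->logline."\n";
        echo $logline->loglevel." ";
    }
}
?>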
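Where the PEAR Services_JSON class is not available, PHP's built-in json_decode() can be used instead. The sketch below assumes the same shape as the sample above, an object with a `loglines` list whose entries carry `loglevel` and `logline`. The JSON text and the function name `formatLogLines` are invented for illustration.

```php
<?php
// Made-up sample response; only the loglines/loglevel/logline names
// are taken from the sample above.
$body = '{"loglines":['
      . '{"loglevel":"I","logline":"peer started"},'
      . '{"loglevel":"W","logline":"queue full"}'
      . ']}';

// Hypothetical helper: return one "level message" string per log line.
function formatLogLines($body)
{
    $data = json_decode($body);
    // json_decode() returns null for invalid JSON
    if (!is_object($data) || empty($data->loglines)) {
        return array();
    }
    $lines = array();
    foreach ($data->loglines as $entry) {
        $lines[] = $entry->loglevel . ' ' . $entry->logline;
    }
    return $lines;
}

foreach (formatLogLines($body) as $line) {
    echo $line . "\n";
}
```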
Use the YaCy API
To shorten repetitive tasks, the PHP class offers the most basic steering functions and makes it easy to start crawls or run extended searches.
To use this class, the PHP file must be included and at least a peer address/port has to be defined.
<?php
// Include the API PHP Library
require 'YaCyAPI2.php';
// start the class
$search = new YaCyAPI();
Now several optional parameters may be set using setProperties, or status information may be queried with getProperties.
Sample Application: YaCymin
This sample is using just a few basic YaCy applets for steering and querying a YaCy peer.
- Dev:APIStatus is used for basic steering like crawl pause/resume and restart/shutdown.
- Dev:APINetwork is used for querying information about the peer's network and its own status.
- Dev:APICrawlProfileEditor is used for querying the peer's crawl profiles.
- Dev:APICrawlStart is used for initializing new crawls and editing/deleting existing ones.
Application index
Each YaCymin applet is called from this main index file, selected by the action parameter.
<?php
// Include the API PHP Library
require 'YaCyAPI2.php';
include 'JSON.php';
// start the class
$search = new YaCyAPI();
// html headers
require 'header_inc.php';
// fall back to the default applet when no action is given
$action = isset($_GET['action']) ? $_GET['action'] : '';
switch ($action) {
    case "peers":
    default:
        require 'inc_peers.php';
        break;
    case "peerlist":
        require 'inc_peerlist.php';
        break;
    case "search":
        require 'inc_search.php';
        break;
} // endswitch
require 'footer_inc.php';
echo "</body></html>";
First, all CSS, JavaScript and PHP includes are loaded and the main menu is shown using header_inc.php.
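The action dispatch in the index file can also be written as a lookup table, which keeps the list of allowed applets in one place and never builds a file name from user input. This is a sketch only; the function name `resolveInclude` is hypothetical, and the file names are the ones used in the index file above.

```php
<?php
// Hypothetical helper: map an action name to its include file.
// Unknown or missing actions fall back to the peers overview,
// mirroring the "default" branch of the switch.
function resolveInclude($action)
{
    $map = array(
        'peers'    => 'inc_peers.php',
        'peerlist' => 'inc_peerlist.php',
        'search'   => 'inc_search.php',
    );
    if (!is_string($action) || !isset($map[$action])) {
        return $map['peers'];
    }
    return $map[$action];
}

echo resolveInclude('search') . "\n";
echo resolveInclude('no-such-action') . "\n";
```

In the index file the result would be passed to require, for example `require resolveInclude(isset($_GET['action']) ? $_GET['action'] : null);`.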
Getting peer status
This applet is called for a quick status overview showing all peers defined in peerlist_inc.php.
// show overview
echo "<h3><font color=grey> ".date('d-m H:i',time())."</font></h3>";
include 'peerlist_inc.php';
// iterate over all peers defined in peerlist_inc.php (index starts at 0)
$maxpeers=count($this_YaCyPeer);
for ($peerno=0;$peerno<$maxpeers;$peerno++) {
    $peer=$this_YaCyPeer[$peerno][0];
    $port=$this_YaCyPeer[$peerno][1];
    $appid=$this_YaCyPeer[$peerno][2];
    $name=$this_YaCyPeer[$peerno][3];
    #$info=$search->peerStatus($peerno); #get peerinfo from internal list
    $res=$search->setProperties($peer.":".$port,$appid,$name);
    $info=$search->ping();
    if ($info['host']) { #peer defined?
        $t="Showing peer-name / address and time for a ping / time for executing commands / peer uptime. Click for more infos.";
        echo '<div class="blogtoy"><h2 title=" '.$t.' " class="widgetheader">';
        if ($info['status']) { #peer online
            #$items=$search->peerInfo($peerno); #extended info
            $items=$search->getProperties();
            $info['name']=$items['name'];
            echo "<table border=0 width=100%><tr><td width=50%>";
            if ($info['name']) {
                echo "<i>".$info['name']."</i> is ";
            } else {
                echo $info['host'].":".$info['port']." is ";
            }
            echo "<font color=green>on</font>";
            echo "</td><td>";
            $ti1=round(($search->_dur1*1000),0);
            $ti2=round(($search->_dur2*1000),0);
            // colour slow timings; if/elseif is used because a switch with
            // comparison cases would also match a timing of 0
            if ($ti1 > 300) {
                $ti1="<font color=red>".$ti1."</font>";
            } elseif ($ti1 > 100) {
                $ti1="<font color=orange>".$ti1."</font>";
            }
            if ($ti2 > 5000) {
                $ti2="<font color=red>".$ti2."</font>";
            } elseif ($ti2 > 500) {
                $ti2="<font color=orange>".$ti2."</font>";
            }
            echo "<font size=1> (<font color=grey>ping:</font>".$ti1." <font color=grey>cmd:</font>".$ti2." ms)</font></td>";
            echo "<td align = right>";
            echo " ".$items['uptime']." ";
            echo "</td></tr></table>";
        } else { #peer offline
            if ($info['name']) {
                echo "<i>".$info['name']."</i> is ";
            } else {
                echo $info['host'].":".$info['port']." is ";
            }
            echo "<font color=red>off";
            echo "</font>";
            echo ' <a href=?action=SSH&host='.$info['host'].' title="SSH"><img src=images/cats.png></a>';
        }
        echo '</h2> <div class="innerwidget">';
        if ($info['status']) {
            echo "<div align=center><h4>Peer Details:</h4>";
            echo $items['type']." ".$items['name']." (".$info['host'].":".$info['port'].")";
            echo " subversion:" .$items['version'];
            echo "<br>links/words: ".$items['links']."/".$items['words']." - ";
            if ($items['acceptindex']) echo "accepts DHT-in ";
            if ($items['acceptcrawl']) echo "is remote crawling";
            echo "<br><a href=http://".$info['host'].":".$info['port']."/Status.html>Admin</a>";
            echo " - <a href=?action=log&peer=".$peerno.">Log</a>";
            echo " - <a href=?action=SSH&host=".$info['host'].">SSH</a>";
            if ($ym_mode=="admin") {
                echo " - <a href=http://".$info['host'].":".$info['port']."/Steering.html?restart=><font color=orange>Restart</font></a>";
                echo " - <a href=http://".$info['host'].":".$info['port']."/Steering.html?shutdown=1><font color=red>Shutdown</font></a>";
            }
            echo "<br><img src=http://".$info['host'].":".$info['port']."/Banner.png?textcolor=000000&bgcolor=ddeeee&bordercolor=aaaaaa>";
            echo "</div>";
            echo '<div class="blogtoy">';
            echo '<h2 class="widgetheader"><font>Performance...</font></h2>';
            echo '<div class="innerwidget">';
            echo '<iframe src="http://'.$info['host'].":".$info['port'].'/rssTerminal.html?set=PEERNEWS,REMOTESEARCH,LOCALSEARCH,REMOTEINDEXING,LOCALINDEXING,INDEXRECEIVE&width=600px&height=180px&maxlines=20&maxwidth=120" style="width:600px;height:180px;margin:0px;border:1px solid black;" scrolling="no" name="newsframe"></iframe><br />';
            echo "<p><img src=http://".$info['host'].":".$info['port']."/PerformanceGraph.png></p>";
            echo "<hr>";
            echo '</div></div>';
        } else {
            echo "Offline";
        }
        echo "</div></div>"; #blogtoy
        echo "</div>";
    } # end if peer defined
} #end for peerno
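The ping and command timings in this applet are coloured by threshold: orange above a warning limit and red above a critical limit. A small helper keeps that rule in one place. The function name `colorizeMs` is hypothetical; the limits in the usage lines are the ones from the applet (100/300 ms for the ping, 500/5000 ms for commands). Plain comparisons are used because a `switch` whose cases are comparisons matches loosely, so a timing of 0 would be treated as equal to a comparison that evaluated to false.

```php
<?php
// Hypothetical helper: wrap a millisecond value in a coloured <font> tag
// when it exceeds the warning or the critical threshold.
function colorizeMs($ms, $warn, $crit)
{
    if ($ms > $crit) {
        return '<font color=red>' . $ms . '</font>';
    }
    if ($ms > $warn) {
        return '<font color=orange>' . $ms . '</font>';
    }
    return (string) $ms;
}

echo colorizeMs(42, 100, 300) . "\n";    // ping within limits
echo colorizeMs(750, 500, 5000) . "\n";  // slow command
```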
Getting Network status
echo "<h3><font color=grey> Active peers in network </font></h3>";
include 'peerlist_inc.php';
$command="Network.xml?page=1";
// the first peer of the list is queried for the network overview
$peerno="0";
$peer=$this_YaCyPeer[$peerno][0];
$port=$this_YaCyPeer[$peerno][1];
$appid=$this_YaCyPeer[$peerno][2];
$name=$this_YaCyPeer[$peerno][3];
$peername="http://".$peer.":".$port."/";
$res=$search->setProperties($peer.":".$port,$appid,$name);
$info=$search->ping();
if ($info['host']) { #peer defined?
    $results = $search->peerCommandDirect($peername,$command);
    // now we have xml, put it in a simple array
    $resultarray=xml2array($results); #, $get_attributes = 1, $priority = 'tag');
    // get items only
    $items=$resultarray['peers']['peer'];
    // columns to show; hash, rurls, lastseen, sendWords, receivedWords,
    // sendURLs, receivedURLs, location, seedurl, seeds and connects are left out
    $fields=array('fullname','type','version','ppm','qph','uptime','links',
        'words','direct','acceptcrawl','dhtreceive','rankingreceive','age');
    if ($items) {
        $tr=""; // row colour, alternates between ffffff and aaaaaa
        echo "<table>";
        foreach ($items as $item) {
            if ($tr=="ffffff") {$tr="aaaaaa";} else {$tr="ffffff";}
            echo "<tr bgcolor=#".$tr.">";
            foreach ($fields as $field) {
                echo "<td>".$item[$field]."</td>";
            }
            echo "</tr>";
        } #end for
        echo "</table>";
    }
} #end if peer online
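Both applets read their peers from peerlist_inc.php, which is not reproduced on this page. Judging only from the indices used above (0 = host, 1 = port, 2 = appid, 3 = name), the file could look like the following sketch. All host names, ports and credentials are placeholders, and the exact meaning of the third and fourth entries is an assumption.

```php
<?php
// Hypothetical contents of peerlist_inc.php: one entry per peer,
// in the order host, port, appid, name.
$this_YaCyPeer = array(
    array('peer1.example', '8090', 'admin:password', 'first peer'),
    array('peer2.example', '8090', 'admin:password', 'second peer'),
);

// Walk the list the same way the applets do and build each base URL.
foreach ($this_YaCyPeer as $peerno => $entry) {
    list($peer, $port, $appid, $name) = $entry;
    $base = 'http://' . $peer . ':' . $port . '/';
    echo $peerno . ': ' . $name . ' at ' . $base . "\n";
}
```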