Contents

1 Introduction
2 API reference
3 Accessing the APIs using non-Java frameworks
4 Steering a peer
5 Resources

Introduction

Besides the web interface, YaCy offers a rich XML- and JSON-based API for interaction. Some of these interfaces can also be accessed as HTML, and those pages are integrated into the YaCy web interface. When you open such a page, an 'API' tooltip icon appears in the upper right corner of the web page, and hovering over it shows a short introduction to the API. The API icon itself links to the XML, JSON or similar API file that presents the displayed data in annotated form. Note that the tooltip and the underlying link to the API path change every time you navigate to another YaCy page: even though the icon looks the same, it always links to the data currently shown on the web page.

API reference

There are different 'generations' of YaCy APIs:

servlets in /yacy/*: these were there first and contain the basic peer-to-peer bootstrap, search and DHT transfer servlets. They are used only for peer-to-peer communication
servlets in /api/*: additional servlets that support the AJAX functions of the YaCy administration interface, to be used locally on the same peer
any other servlet that mirrors the content of a web page but has a .xml or .json extension. These API servlets are marked within the administration interface with an orange 'API' icon in the top right corner of the administration web page
the Solr search servlet at /solr/select of the embedded Solr search server, which provides exactly the same query functionality as the original Solr, as described in the Solr wiki.

Search Interface

/
/yacysearch.rss and /yacysearch.json	YaCy search page returning XML (OpenSearch) or JSON results
/suggest.xml and /suggest.json	YaCy suggest interface returning XML (OpenSearch-compliant) or JSON results
/solr
/solr/select	The (original) Solr search API embedded into YaCy
/gsa
/gsa/searchresult	The (re-implemented) Google Search Appliance API embedded into YaCy
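
The search endpoints listed above are plain HTTP GET interfaces, so a query is just a URL with parameters. The following is a minimal Ruby sketch of how such URLs could be assembled with the standard library only. The helper names are hypothetical. The parameters query and maximumrecords are taken from the Ruby samples further down this page; q, rows and wt are standard Solr parameters and are assumed to be honored by the embedded Solr server, since it is described as behaving like the original. Nothing is sent over the network here.

```ruby
require 'uri'

# Hypothetical helper: build a URL for the YaCy search servlet.
# format is 'json' or 'rss', matching /yacysearch.json and /yacysearch.rss.
def yacy_search_uri(base_url, terms, format: 'json', max_records: 10)
  uri = URI.join(base_url, "yacysearch.#{format}")
  uri.query = URI.encode_www_form(query: terms, maximumrecords: max_records)
  uri
end

# Hypothetical helper: build a URL for the embedded Solr select servlet.
# q, rows and wt are standard Solr query parameters (assumed to apply here).
def solr_select_uri(base_url, q, rows: 10)
  uri = URI.join(base_url, 'solr/select')
  uri.query = URI.encode_www_form(q: q, rows: rows, wt: 'json')
  uri
end

puts yacy_search_uri('http://localhost:8090/', 'open source')
puts solr_select_uri('http://localhost:8090/', 'yacy', rows: 5)
```

The resulting URLs can be passed to wget, curl or any HTTP client; the base URL http://localhost:8090/ is the one used in the samples on this page and has to be adapted to the peer being queried.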

Peer-to-Peer Communication

/yacy
/yacy/seedlist.html / /yacy/seedlist.json / /yacy/seedlist.xml	the YaCy p2p network bootstrapping seed list
/yacy/crawlReceipt.html	feedback from a remote peer to transmit metadata of loaded (remote-crawled) content
/yacy/hello.html	called for a peer-ping, the network keep-alive process
/yacy/idx.json	retrieve the known internet network structure based on inter-host links
/yacy/list.html	lists shared blacklists
/yacy/message.html	send a message to a remote peer
/yacy/profile.html	get the profile of a remote peer
/yacy/query.html	query specific throughput and sizing parameters from the remote peer
/yacy/search.html	DHT search on the remote peer
/yacy/transferRWI.html	send an RWI to a remote peer
/yacy/transferURL.html	send URL metadata to a remote peer
/yacy/urls.xml	ask for remote crawl URL lists that the requesting peer wants to load

AJAX Services for the local peer

Some of these servlets are protected: all servlets ending with '_p' can only be accessed from localhost or after authorization.

/api
/api/blacklists_p.xml
/api/config_p.xml
/api/feed.rss
/api/push_p.json
/api/queues_p.xml
/api/schema.xml
/api/status_p.xml
/api/table_p.xml
/api/version.xml	YaCy SVN version
/api/webstructure.xml
/api/yacydoc.xml
/api/util
/api/util/getpageinfo_p.xml	crawling information for a single URL
/api/util/ynetSearch.xml
/api/blacklists
/api/blacklists/get_metadata_p[.xml | .json]	list of all blacklists and their metadata
/api/blacklists/get_list_p[.xml | .json]	metadata and content of a specific list
/api/blacklists/add_entry_p[.xml | .json]	adds a new entry to a blacklist
/api/blacklists/delete_entry_p[.xml | .json]	deletes an entry from a blacklist
/api/bookmarks
/api/bookmarks/get_bookmarks[.xml | .json]
/api/bookmarks/posts
/api/bookmarks/posts/add_p.xml
/api/bookmarks/posts/all.xml
/api/bookmarks/posts/delete_p.xml
/api/bookmarks/posts/get.xml
/api/bookmarks/tags
/api/bookmarks/tags/getTag.xml
/api/bookmarks/tags/rename_p.xml
/api/bookmarks/xbel
/api/bookmarks/xbel/xbel.xml
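
The naming rule stated at the top of this section (servlets ending in '_p' are protected) can be checked mechanically before a client decides whether to send credentials. The following Ruby sketch is only an illustration of that rule; the helper name protected_servlet? is hypothetical, and the check looks at the servlet name alone, so it says nothing about how a particular peer is actually configured.

```ruby
# Hypothetical helper: true if the servlet name (without extension and
# query string) ends with '_p', which this page describes as protected.
def protected_servlet?(path)
  name = File.basename(path.to_s.split('?', 2).first.to_s) # drop directories and query
  stem = name.sub(/\.[^.]+\z/, '')                         # drop the file extension
  stem.end_with?('_p')
end

['/api/status_p.xml', '/api/version.xml', '/api/blacklists/get_list_p.json'].each do |path|
  puts "#{path}: #{protected_servlet?(path) ? 'protected' : 'open'}"
end
```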

HTML Servlets

These servlets are used for online administration of a YaCy peer, but they can also be used for remote steering by simply calling the HTTP interface with a tool such as wget or curl.

/
/AccessTracker_p.xml	peer access statistics
/Blog.xml	YaCy blog
/Crawler_p.html	start a web crawl
/CrawlStartExpert.html	start a web crawl in expert mode
/CrawlProfileEditor_p.xml	show and edit crawl profiles
/IndexDeletion_p.html	delete documents from the Solr index
/IndexSchema_p.html	show and edit the Solr index schema
/Messages_p.xml
/Network.xml	peer and network statistics
/News.rss
/opensearchdescription.xml
/PerformanceMemory_p.xml	peer memory status
/PerformanceQueues_p.xml	peer status of busy queues
/QuickCrawlLink_p.xml	single URL crawl start with immediate confirmation
/Status.html	peer steering: shutdown, restart, pause/resume crawls
/Steering.html	update peer
/ViewProfile.xml	view peer profile
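
Calling one of the servlets above from a script amounts to an authenticated HTTP GET, much like a wget or curl call. The following Ruby sketch prepares such a request with the standard library only. The helper name admin_request, the credentials and the use of HTTP basic auth are assumptions for illustration; the request is only built here, not sent.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: prepare a GET request for an administration servlet.
# Returns the target URI and the request object with basic-auth credentials set.
def admin_request(base_url, servlet, user, password, params = {})
  uri = URI.join(base_url, servlet)
  uri.query = URI.encode_www_form(params) unless params.empty?
  request = Net::HTTP::Get.new(uri)
  request.basic_auth(user, password) # placeholder credentials, see below
  [uri, request]
end

uri, request = admin_request('http://localhost:8090/', 'PerformanceQueues_p.xml',
                             'admin', 'secret')
puts "prepared GET #{uri}"

# To actually send it (not executed in this sketch):
#   response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
#   puts response.body
```

Keeping the request construction separate from sending makes it easy to inspect the URL first, which is useful when the parameters were copied from a page in the admin interface.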

Accessing the APIs using non-Java frameworks

Information can be retrieved from a peer by simply calling the appropriate servlet and decoding the delivered XML. The easiest way to explore other API calls is to perform the desired action in the YaCy admin interface and then use the same parameters when calling the RSS or XML servlet, e.g. Network.xml instead of Network.html. Most actions that were issued on the YaCy interface to change the configuration or to request crawl actions can be examined on the page Table_API_p.html.
After the query results have been received, the delivered XML or JSON must be converted into a SimpleXML object or an array. The client then iterates over the elements in the response, processing each one in a foreach() loop and retrieving the information sent by the peer. Here is how some sample peer information is retrieved using PHP or similar languages for web applications.
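
The tip about reusing the parameters of an admin page with its XML or JSON counterpart can be expressed as a small path rewrite. The following Ruby sketch swaps the extension of a page name and keeps its query string untouched. The helper name api_variant is hypothetical, and not every HTML page necessarily has an XML or JSON counterpart, so the result is only a candidate URL to try.

```ruby
# Hypothetical helper: turn an admin page name into its API counterpart by
# replacing the file extension and preserving any query string.
def api_variant(page, format = 'xml')
  path, query = page.to_s.split('?', 2)
  swapped = path.to_s.sub(/\.[^.\/]+\z/, ".#{format}")
  query ? "#{swapped}?#{query}" : swapped
end

puts api_variant('Network.html')
puts api_variant('Network.html?page=1')
puts api_variant('yacysearch.html?query=test', 'json')
```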

Handling XML with PHP5

Open a connection to the desired peer and send an HTTP request. A PHP5 class, Dev:YaCyAPIforPHP, is available for simple handling of requests to one or multiple YaCy peers.
A native HTTP request can be handled with cURL as shown in this example:

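<?php
  // Method using native php-curl.
  $YaCyURL = "http://mypeer.tld:8090/";
  // Credentials in "user:password" form; placeholder values, replace with your own.
  $appID = "admin:password";
  $cu = $YaCyURL . "Status.html";
  $queryServer = curl_init($cu);
  curl_setopt($queryServer, CURLOPT_HEADER, 0);          // keep response headers out of the result
  curl_setopt($queryServer, CURLOPT_RETURNTRANSFER, 1);  // return the body instead of printing it
  curl_setopt($queryServer, CURLOPT_USERPWD, $appID);    // send the credentials with the request
  $results = curl_exec($queryServer);
  curl_close($queryServer);
?>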

The peer's friendly name is stored in the <your> node collection; the sample accesses this node collection as $yourpeer and reads values such as the name or hash from $yourpeer['name'] or $yourpeer['hash'].
The network's URL count is stored in the <all> node collection; the sample accesses this node collection as $allpeers and reads the value from $allpeers['count'].

<?php
  // Method using the YaCyAPI PHP class.
  require 'YaCyAPI2.php';
  // Instantiate the class.
  $search = new YaCyAPI();
  $results = $search->peerCommand("Network.xml");
  // Now we have XML; convert it to a plain PHP array.
  // xml2array is not a PHP built-in; it is assumed to be provided alongside the class.
  $resultarray = xml2array($results);
  // Get items.
  $yourpeer = $resultarray['peers']['your'];
  $peername = $yourpeer['name'];
  $peerhash = $yourpeer['hash'];
  //
  $allpeers = $resultarray['peers']['all'];
  $urlcount = $allpeers['count'];
?>

The returned XML string is now converted to an array (xml2array).
The next example calls Network.xml with the page parameter to retrieve information about all peers in the queried network.

<?php
// $search is the YaCyAPI instance created in the previous sample.
$results = $search->peerCommand("Network.xml?page=1");
// Now we have XML; put it in a simple array.
$resultarray = xml2array($results);
// Get items only.
$items = $resultarray['peers']['peer'];
if ($items)
{
  echo "<h1>Active Peers</h1>";
  echo "<table>";
  $tr = "aaaaaa"; // row background colour, toggled per row so the first row is white
  foreach ($items as $item)
  {
   if ($tr=="ffffff") {$tr="aaaaaa";} else {$tr="ffffff";}
   echo "<tr bgcolor=\"#".$tr."\">";
   echo "<td>".$item['hash']."</td>";
   echo "<td>".$item['fullname']."</td>";
   echo "<td>".$item['type']."</td>";
   echo "<td>".$item['version']."</td>";
   echo "<td>".$item['ppm']."</td>";
   echo "<td>".$item['qph']."</td>";
   echo "<td>".$item['uptime']."</td>";
   echo "<td>".$item['links']."</td>";
   echo "<td>".$item['words']."</td>";
   echo "<td>".$item['rurls']."</td>";
   echo "<td>".$item['lastseen']."</td>";
   echo "<td>".$item['sendWords']."</td>";
   echo "<td>".$item['receivedWords']."</td>";
   echo "<td>".$item['sendURLs']."</td>";
   echo "<td>".$item['receivedURLs']."</td>";
   echo "<td>".$item['direct']."</td>";
   echo "<td>".$item['acceptcrawl']."</td>";
   echo "<td>".$item['dhtreceive']."</td>";
   echo "<td>".$item['rankingreceive']."</td>";
   echo "<td>".$item['location']."</td>";
   echo "<td>".$item['seedurl']."</td>";
   echo "<td>".$item['age']."</td>";
   echo "<td>".$item['seeds']."</td>";
   echo "<td>".$item['connects']."</td>";
   echo "</tr>"; 
  }
  echo "</table>";
}
?>

Handling JSON with PHP5

Some servlets can be called to deliver JSON instead of XML. Results are delivered slightly faster, and most parsers decode the returned data more quickly, so this format should be preferred where speed matters.

Handling XML or JSON with Ruby

The Ruby gem [httparty] makes it easy to consume REST-like APIs and offers great flexibility. This example uses it to run a quick search for 25 global results for the query 'test' and prints the links found.

XML Example

require 'httparty'

class YaCy
  include HTTParty

  format :xml
  base_uri 'http://localhost:8090'

  def self.search(q)
    get('/yacysearch.rss', :query => {
      :query => q,
      :resource => 'global',
      :verify => false,
      :maximumrecords => 25
    })
  end
end

begin
  channel = YaCy.search('test')
  channel['rss']['channel']['item'].each do |item|
    puts item['link']
  end
rescue
  p 'oops nothing found'
end

JSON Example

require 'httparty'

class YaCy
  include HTTParty

  format :json
  base_uri 'http://localhost:8090'

  def self.search(q)
    get('/yacysearch.json', :query => {
      :query => q,
      :resource => 'global',
      :verify => false,
      :maximumrecords => 25
    })
  end
end

begin
  channel = YaCy.search('test')
  channel['channels'].first['items'].each do |item|
    puts item['link']
  end
rescue
  p 'oops nothing found'
end
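
The two samples above rely on a bare rescue to cover every failure. As a complement, the following sketch shows how the link extraction could be separated from the HTTP call and made explicit about missing fields, using only Ruby's bundled json library. The helper name extract_links and the sample document are made up for illustration; only the channels / items / link nesting is taken from the JSON sample above.

```ruby
require 'json'

# Hypothetical helper: pull the result links out of a yacysearch.json body.
# Returns an empty array when the body is not valid JSON or has no items.
def extract_links(json_text)
  data = JSON.parse(json_text)
  return [] unless data.is_a?(Hash)

  channels = data['channels']
  return [] unless channels.is_a?(Array) && channels.first.is_a?(Hash)

  items = channels.first['items']
  return [] unless items.is_a?(Array)

  items.select { |item| item.is_a?(Hash) }.map { |item| item['link'] }.compact
rescue JSON::ParserError
  []
end

# Made-up sample response with the same nesting as in the JSON example.
sample = '{"channels":[{"items":[{"link":"http://example.org/a"},' \
         '{"link":"http://example.org/b"}]}]}'
extract_links(sample).each { |link| puts link }
```

Because the function takes the response body as a string, it can be exercised without a running peer and reused with any HTTP client.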

Handling XML or JSON with Perl

For Perl, a library [Ismael] is available to handle requests and returned results.

Steering a peer

To initiate functions without waiting for a result, such as pausing/resuming crawls or shutting down the peer, just call the servlet as in the admin interface.

http://localhost:8090/Steering.html?restart=

will restart the peer after confirming admin credentials, unless they were delivered with the query via HTTP basic auth. As the peer does not have to confirm this action and has no need to deliver any data, the client does not have to parse any response.
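
A steering call of this kind can be sketched in Ruby as a URL with a single valueless parameter. The helper name steering_uri is hypothetical, and only the restart action is taken from the example above; any other action name passed to it would be an assumption to be checked against the admin interface. The sketch only builds the URL and sends nothing.

```ruby
require 'uri'

# Hypothetical helper: build a Steering.html URL of the form
# <base>/Steering.html?<action>= as in the restart example above.
def steering_uri(base_url, action)
  uri = URI.join(base_url, 'Steering.html')
  uri.query = URI.encode_www_form(action => '') # empty value yields "action="
  uri
end

puts steering_uri('http://localhost:8090/', 'restart')
```

Since no data has to be parsed, a client only needs to send the request with admin credentials and can ignore the response body.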

Resources

As these examples show, the YaCy API is very useful when you want to mash up data found or delivered by YaCy with data from other services, or simply build a customized interface for the YaCy community.
For more information about REST, XML, JSON and implementations in popular web programming languages see also

Dev:API
