| |
|
|

|
|
Development of
custom GATE plugins and resources for Named Entity Extraction on contracts.
|
| |
|
|

|
|
Customisation and hosting of Nutch for a vertical crawl.
|
| |
|
|

|
|
Auditing, redesign and optimisation
of a SOLR setup for a real estate search system using geo-location.
|
| |
|
|

|
|
Whole web crawling using Nutch on Amazon EC2. Development of
custom Nutch plugins and resources.
|
| |
|
|
|
|
Consulting on GATE for Named Entity Recognition;
improvement of the accuracy of the ANNIE application.
|
| |
|
|

|
|
Port of the RASP application to Apache UIMA.
More details can be found in the Resources section.
|
| |
|
|

|
|
Design of an advanced architecture for a search solution based on Nutch / Lucene.
Work on a performance benchmark and optimisation of the results.
Work on Term Extraction and Clustering, Text Classification, Ontology
Learning and custom Information Extraction.
|
| |
|
|

|
|
Strategy
review and integration design for mobile content-based search engines.
|
| |
|
|

|
|
Design
and implementation of a search solution based on SOLR.
|
| |
|
|

|
|
Development of
custom Nutch plugins and resources. Monitoring of crawls. Deployment and tuning of SOLR instances.
|
| |
|
|
 |
|
Implementation of search functionality based on Lucene and compliant with the OpenSearch standard.
Design and development of a Text Classification web service,
used to identify junk posts in a collection of forum
pages indexed with Lucene. Filtering these posts
improves the relevance of the search engine results, as such
documents tend to rank high due to the repetition of keywords
(e.g. product names). The format of the messages used by the
service is based on Solr.
|
| |
|
|
 |
|
DigitalPebble has co-designed and
implemented the full-text search functionalities of Lingway KM, which
uses Lucene as its default implementation
and illustrates both its versatility and
performance.
|