Extending the Extraction-Service Using Plug-ins
Beginning with yuuvis® RAD 9.0, the extraction-service can be extended by plug-ins.
Including an Existing Plug-in in the Extraction-Service
The deployment of an existing plug-in is explained by setting up an example plug-in for processing PGN files.
Configuring the Extraction-Service
By default, the use of plug-ins is disabled. The following settings in the extraction-prod.yml file enable plug-ins and define the plug-in directory:
plugins:
  enabled: true
  directory: '../../data/plugins'
The jar file of the plug-in must then be stored in the specified directory.
The extraction-service must be restarted to enable the plug-in.
Any number of plug-in files can be stored in this directory.
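Before restarting the service, it can be useful to confirm which jar files are actually present in the configured directory. The following is a minimal sketch using only the Java standard library; the class name PluginDirectoryCheck and its method are hypothetical and not part of the extraction-service. It assumes that plug-ins are plain *.jar files located directly in the directory. Note that a relative path is resolved against the working directory of this program, which may differ from that of the extraction-service.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of the extraction-service.
final class PluginDirectoryCheck {

    // Returns the sorted names of all *.jar files located directly in the directory.
    static List<String> listPluginJars(Path directory) throws IOException {
        try (Stream<Path> entries = Files.list(directory)) {
            return entries
                    .filter(Files::isRegularFile)
                    .map(path -> path.getFileName().toString())
                    .filter(name -> name.endsWith(".jar"))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the plug-in directory is passed as the first argument;
        // the fallback is the value from the configuration example.
        Path directory = Paths.get(args.length > 0 ? args[0] : "../../data/plugins");
        List<String> jars = listPluginJars(directory);
        System.out.println("jar files found: " + jars.size());
        jars.forEach(System.out::println);
    }
}
```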
Checking the Logging of the Plug-in
As soon as the extraction-service has been restarted, available plug-ins are detected. Information on these plug-ins can be found in the log.
If the plugins.enabled switch is not set to true, the following is written to the log:
... c.o.s.e.routing.PluginProcessor : plugins not enabled
If plug-ins are enabled but none are stored in the directory, the log looks as follows:
... c.o.s.e.routing.PluginProcessor : plugins enabled: ../../data/plugins
... c.o.s.e.routing.PluginProcessor : plugins found: 0
If plug-ins are found, they are listed with their names and namespaces:
... c.o.s.e.routing.PluginProcessor : plugins enabled: ./../data/plugins
... c.o.s.e.routing.PluginProcessor : plugins found: 2
... c.o.s.e.routing.PluginProcessor : plugin 'PgnFile': PGNFILE
... c.o.s.e.routing.PluginProcessor : plugin 'CAD-File': CAD
If a plug-in is applicable to a specific file, the number of values it extracted from that file is written to the log:
... c.o.s.e.routing.PluginProcessor : extracted by plugin 'PgnFile': 7
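For an automated check, the registration lines can be read from the log programmatically. The following sketch assumes that the log lines have exactly the form shown in the excerpts above; the class PluginLogCheck and the regular expression are illustrative and not part of the product.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the extraction-service.
final class PluginLogCheck {

    // Matches lines such as "... PluginProcessor : plugin 'PgnFile': PGNFILE".
    // Lines starting with "extracted by plugin" are deliberately not matched.
    private static final Pattern REGISTERED =
            Pattern.compile("PluginProcessor\\s*:\\s*plugin '([^']+)':\\s*(\\S+)");

    // Maps each detected plug-in name to its namespace, in log order.
    static Map<String, String> registeredPlugins(List<String> logLines) {
        Map<String, String> plugins = new LinkedHashMap<>();
        for (String line : logLines) {
            Matcher matcher = REGISTERED.matcher(line);
            if (matcher.find()) {
                plugins.put(matcher.group(1), matcher.group(2));
            }
        }
        return plugins;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "... c.o.s.e.routing.PluginProcessor : plugins found: 2",
                "... c.o.s.e.routing.PluginProcessor : plugin 'PgnFile': PGNFILE",
                "... c.o.s.e.routing.PluginProcessor : plugin 'CAD-File': CAD");
        System.out.println(registeredPlugins(lines));
    }
}
```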
Result of the Extraction
The result of the extraction from this PGN file will look like this:
"OS:FileName": "schach_PGN.pgn",
"OS:FileSize": "740",
"OS:ModifyDate": "2023-02-10T09:42:24+01:00",
"OS:CreateDate": "2023-02-10T09:42:24+01:00",
"OS:FileType": "TXT",
"OS:MimeType": "text/plain",
"ExifTool:ExifToolVersion": "12.52",
"System:FileName": "schach_PGN.pgn",
"System:Directory": "C:/Temp/Local/Temp/20230210.094224.115-1/389a8d00-25f8-4f84-9d08-238f0e81594c",
"System:FileSize": "740",
"System:FileModifyDate": "2023:02:10 09:42:24+01:00",
"System:FileAccessDate": "2023:02:10 09:42:24+01:00",
"System:FileCreateDate": "2023:02:10 09:42:24+01:00",
"System:FilePermissions": "100666",
"File:FileType": "TXT",
"File:FileTypeExtension": "TXT",
"File:MIMEType": "text/plain",
"File:MIMEEncoding": "us-ascii",
"File:Newlines": "\n",
"File:LineCount": "16",
"File:WordCount": "158",
"PGNFILE:Site": "Belgrade, Serbia JUG",
"PGNFILE:White": "Fischer, Robert J.",
"PGNFILE:Event": "F/S Return Match",
"PGNFILE:Round": "29",
"PGNFILE:Black": "Spassky, Boris V.",
"PGNFILE:Date": "1992-11-04T12:00:00+01:00",
"PGNFILE:Result": "1/2-1/2",
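The last seven entries carry the namespace of the plug-in (PGNFILE) as a prefix. The plug-in itself returns unprefixed keys such as White; according to the comments in the example plug-in, the prefix is set by the extraction-service. The following sketch only illustrates the resulting key format. The helper withNamespace is hypothetical and does not claim to reproduce the actual implementation of the service.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only; not the implementation used by the extraction-service.
final class NamespaceDemo {

    // Joins namespace and key with a colon, as seen in the result above.
    static Map<String, String> withNamespace(String namespace, Map<String, String> extracted) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : extracted.entrySet()) {
            result.put(namespace + ":" + entry.getKey(), entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        // Keys as a plug-in would return them: without prefix.
        Map<String, String> extracted = new LinkedHashMap<>();
        extracted.put("White", "Fischer, Robert J.");
        extracted.put("Result", "1/2-1/2");
        System.out.println(withNamespace("PGNFILE", extracted));
    }
}
```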
Creating a Plug-in
This section describes how to set up, configure, deploy, and check a new plug-in.
The plug-in capability of the extraction-service is based on the PF4J framework.
The extraction-service loads plug-ins via PF4J-Spring.
The plug-in must comply with the interface defined by PF4J-Spring. In particular, it must provide a constructor that takes a PluginWrapper as its parameter:
public PgnFilePlugin(PluginWrapper wrapper) {
    super(wrapper);
}
Including the Library for the Definition of the Plug-in Interface
OPTIMAL SYSTEMS provides a library for the definition of the plug-in interface. The extractionservice-plugin-interface.jar file is available for download below and must be added to the project. It defines the interface for the extraction-service and also brings the plug-in capability of PF4J.
It is important that the dependency of the plug-in on the interface library is declared with the scope provided (in Gradle: compileOnly):
<dependency>
    <groupId>com.os.enaio.services.extraction</groupId>
    <artifactId>extractionservice-plugin-interface</artifactId>
    <version>9.0.2</version>
    <scope>provided</scope>
</dependency>
Implementing the Plug-in
As an example of a plug-in, the extractionservice-plugin-demo.jar file is also available for download.
The following example operates on PGN files.
Plug-in example code:
package com.os.services.extraction.plugin.demo;

import com.os.services.extraction.plugin.ExtractionDriver;
import com.os.services.extraction.plugin.ExtractionInfo;
import org.apache.commons.io.IOUtils;
import org.pf4j.Extension;
import org.pf4j.Plugin;
import org.pf4j.PluginWrapper;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.io.File;
import java.io.FileInputStream;
import java.util.HashMap;
import java.util.Map;

public class PgnFilePlugin extends Plugin {

    private static final Logger LOGGER = LoggerFactory.getLogger(PgnFilePlugin.class);

    /**
     * Constructor used by the plug-in manager to instantiate the plug-in.
     * A plug-in must provide a constructor with exactly this signature
     * to be loaded successfully by the manager.
     *
     * @param wrapper the PluginWrapper supplied by the plug-in manager
     */
    public PgnFilePlugin(PluginWrapper wrapper) {
        super(wrapper);
    }

    @Override
    public void start() {
        LOGGER.info("Start");
    }

    @Override
    public void stop() {
        LOGGER.info("Stop");
    }

    @Extension
    public static class PgnFileProcessor implements ExtractionDriver {

        private static final Logger LOGGER = LoggerFactory.getLogger(PgnFileProcessor.class);
        private static final String PREFIX = "PGNFILE";

        @Override
        public ExtractionInfo info() {
            return null;
        }

        @Override
        public String getExtractionDriverName() {
            return "PgnFile";
        }

        @Override
        public String getExtractionNamespace() {
            return PREFIX;
        }

        @Override
        public boolean isApplicable(File file) {
            return file.getName().endsWith(".pgn");
        }

        @Override
        public Map<String, String> extractData(File file) {
            Map<String, String> result = new HashMap<>();
            // try-with-resources closes the stream even if reading fails
            try (FileInputStream in = new FileInputStream(file)) {
                String str = IOUtils.toString(in, "UTF-8");
                // tag pairs have the form [Key "value"], so split at the closing bracket
                for (String token : str.split("]")) {
                    token = token.trim();
                    if (token.startsWith("[")) {
                        String[] tag = token.substring(1).split("\"");
                        if (tag.length >= 2) {
                            // do not add the prefix to the key;
                            // it will be set by the extraction-service
                            String key = tag[0].trim();
                            String value = tag[1].trim();
                            if (key.endsWith("Date")) {
                                // "1992.11.04" becomes "1992-11-04T12:00:00+01:00";
                                // unknown parts ("??") are replaced by "01"
                                value = value.replaceAll("\\?{2}", "01")
                                        .replaceAll("\\.", "\\-") + "T12:00:00+01:00";
                            }
                            result.put(key, value);
                        } else if (tag.length == 1) {
                            // tag without a quoted value: store the key with an empty value
                            String key = tag[0].trim();
                            result.put(key, "");
                        }
                    }
                }
            } catch (Exception e) {
                LOGGER.error(e.getMessage(), e);
            }
            return result;
        }
    }
}
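To try out the extraction logic without PF4J or a running extraction-service, the tag parsing can be isolated in a small class. The following sketch is an alternative to the split-based approach of the example plug-in, not a copy of it: it uses a regular expression that only accepts tag pairs standing on their own line, so brackets inside the move text are ignored. The class PgnTagParser is hypothetical. Tags without a quoted value and escaped quotes inside values are not handled here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-alone variant of the tag parsing; standard library only.
final class PgnTagParser {

    // A PGN tag pair on its own line: [Name "value"]
    private static final Pattern TAG_PAIR =
            Pattern.compile("^\\s*\\[(\\w+)\\s+\"([^\"]*)\"\\]\\s*$", Pattern.MULTILINE);

    // Returns the tag pairs in file order, with unprefixed keys.
    static Map<String, String> parseTags(String pgn) {
        Map<String, String> tags = new LinkedHashMap<>();
        Matcher matcher = TAG_PAIR.matcher(pgn);
        while (matcher.find()) {
            String key = matcher.group(1);
            String value = matcher.group(2).trim();
            if (key.endsWith("Date")) {
                value = normalizeDate(value);
            }
            tags.put(key, value);
        }
        return tags;
    }

    // Same convention as the example plug-in: "??" becomes "01", dots become
    // dashes, and a fixed time with offset is appended.
    static String normalizeDate(String pgnDate) {
        return pgnDate.replace("??", "01").replace('.', '-') + "T12:00:00+01:00";
    }

    public static void main(String[] args) {
        String pgn = "[Event \"F/S Return Match\"]\n"
                + "[Date \"1992.11.04\"]\n"
                + "\n"
                + "1. e4 e5 1/2-1/2\n";
        System.out.println(parseTags(pgn));
    }
}
```

The date handling is kept in a separate method so that it can be checked on its own, for example with partially unknown dates such as 1992.??.??.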
Building the jar File
When the plug-in is packaged in a jar file, there are important points to note.
The following attributes must be specified in the MANIFEST.MF file (the values shown are those of the example plug-in):
Plugin-Class: com.os.services.extraction.plugin.demo.PgnFilePlugin
Plugin-Dependencies:
Plugin-Id: PgnFileProcessor-Plugin
Plugin-Provider: os
Plugin-Version: 9.0.0
For these values to be inserted into the jar file automatically, the maven-assembly-plugin can be configured as follows. The ${plugin.*} placeholders refer to properties that must be defined in the POM of the project:
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-assembly-plugin</artifactId>
    <version>3.1.0</version>
    <configuration>
        <descriptorRefs>
            <descriptorRef>jar-with-dependencies</descriptorRef>
        </descriptorRefs>
        <finalName>${project.artifactId}-${project.version}-all</finalName>
        <appendAssemblyId>false</appendAssemblyId>
        <attach>false</attach>
        <archive>
            <manifest>
                <addDefaultImplementationEntries>true</addDefaultImplementationEntries>
                <addDefaultSpecificationEntries>true</addDefaultSpecificationEntries>
            </manifest>
            <manifestEntries>
                <Plugin-Id>${plugin.id}</Plugin-Id>
                <Plugin-Version>${plugin.version}</Plugin-Version>
                <Plugin-Provider>${plugin.provider}</Plugin-Provider>
                <Plugin-Class>${plugin.class}</Plugin-Class>
                <Plugin-Dependencies>${plugin.dependencies}</Plugin-Dependencies>
            </manifestEntries>
        </archive>
    </configuration>
    <executions>
        <execution>
            <id>make-assembly</id>
            <phase>package</phase>
            <goals>
                <goal>single</goal>
            </goals>
        </execution>
    </executions>
</plugin>
The jar file must contain all required libraries (except extractionservice-plugin-interface and thus also pf4j).
The MANIFEST.MF file must have the appropriate entries.
The extensions.idx file must be included. It is automatically generated by PF4J during build.
This generated jar file can now be placed in the appropriately configured directory of the extraction-service.
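Before deploying the jar file, the points listed above can be verified with a few lines of standard-library Java. The following is a minimal sketch under stated assumptions: the class PluginJarCheck is hypothetical, the list of required attributes follows the manifest example above (Plugin-Dependencies is not checked because it is empty there), and the extensions.idx file is searched for by name only, without assuming its location inside the jar file.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

// Hypothetical pre-deployment check, not part of the extraction-service.
final class PluginJarCheck {

    // Attributes taken from the manifest example of the demo plug-in.
    private static final String[] REQUIRED_ATTRIBUTES = {
            "Plugin-Class", "Plugin-Id", "Plugin-Provider", "Plugin-Version"
    };

    // Returns one message per problem found; an empty list means all checks passed.
    static List<String> findProblems(Path jarPath) throws IOException {
        List<String> problems = new ArrayList<>();
        try (JarFile jar = new JarFile(jarPath.toFile())) {
            Manifest manifest = jar.getManifest();
            if (manifest == null) {
                problems.add("MANIFEST.MF is missing");
            } else {
                Attributes main = manifest.getMainAttributes();
                for (String name : REQUIRED_ATTRIBUTES) {
                    String value = main.getValue(name);
                    if (value == null || value.trim().isEmpty()) {
                        problems.add("manifest attribute missing or empty: " + name);
                    }
                }
            }
            // Look for the index file by name, wherever it is located in the jar.
            boolean hasIndex = jar.stream()
                    .anyMatch(entry -> entry.getName().endsWith("extensions.idx"));
            if (!hasIndex) {
                problems.add("extensions.idx is missing");
            }
        }
        return problems;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: PluginJarCheck <path-to-plugin.jar>");
            return;
        }
        List<String> problems = findProblems(Paths.get(args[0]));
        if (problems.isEmpty()) {
            System.out.println("no problems found");
        } else {
            problems.forEach(System.out::println);
        }
    }
}
```

This check does not replace the log check after the restart; it only catches packaging mistakes early.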
Checking the Plug-in
To check the new plug-in, the extraction-service must be configured as described in Including an Existing Plug-in in the Extraction-Service.