Beginning with yuuvis RAD version 9.0, the Extraction Service can be extended by plugins.
This is based on the framework PF4J: https://pf4j.org/
Inside the Extraction Service, plugins are loaded via PF4J-Spring:
https://github.com/pf4j/pf4j-spring
Including an existing plugin in the Extraction Service
The deployment of an existing plugin is explained by setting up the example plugin for processing PGN files: https://en.wikipedia.org/wiki/Portable_Game_Notation
Configure the Extraction Service
By default, the use of plugins is disabled. In the extraction-prod.yml file, these settings can be adjusted:
    plugins:
      enabled: true
      directory: "../../data/plugins"
The jar file of the plugin must then be stored in the specified directory.
The plugin is activated when the Extraction Service is restarted.
Any number of plugin files can be stored in this directory.
Check the logging of the plugin
When the Extraction Service is restarted, the available plugins are read in and the result is written to the log.
If the "plugins.enabled" switch has not been set to "true", this will be indicated in the log:
... c.o.s.e.routing.PluginProcessor : plugins not enabled
If the plugins have been activated but no plugins have been put in the directory:
    ... c.o.s.e.routing.PluginProcessor : plugins enabled: ../../data/plugins
    ... c.o.s.e.routing.PluginProcessor : plugins found: 0
If plugins are found, they are listed with their names and "namespaces":
    ... c.o.s.e.routing.PluginProcessor : plugins enabled: ./../data/plugins
    ... c.o.s.e.routing.PluginProcessor : plugins found: 2
    ... c.o.s.e.routing.PluginProcessor : plugin 'PgnFile': PGNFILE
    ... c.o.s.e.routing.PluginProcessor : plugin 'CAD-File': CAD
When a plugin processes a file, the number of values it extracted is written to the log:
... c.o.s.e.routing.PluginProcessor : extracted by plugin 'PgnFile': 7
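The log messages above can also be checked programmatically, for example in a deployment smoke test. The following is a minimal sketch that assumes the message texts are exactly as shown above; the class name PluginLogCheckSketch and its methods are hypothetical and not part of the Extraction Service.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: scans log lines for the PluginProcessor messages shown above.
final class PluginLogCheckSketch {

    // Assumes the message text "plugins found: <n>" as shown in the log excerpts.
    private static final Pattern FOUND =
            Pattern.compile("PluginProcessor\\s*:\\s*plugins found: (\\d+)");

    // Returns the plugin count of the last "plugins found" message, if there is one.
    static OptionalInt pluginsFound(List<String> logLines) {
        OptionalInt result = OptionalInt.empty();
        for (String line : logLines) {
            Matcher matcher = FOUND.matcher(line);
            if (matcher.find()) {
                result = OptionalInt.of(Integer.parseInt(matcher.group(1)));
            }
        }
        return result;
    }

    // True if any line reports that the plugins switch is off.
    static boolean pluginsDisabled(List<String> logLines) {
        return logLines.stream().anyMatch(line -> line.contains("plugins not enabled"));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: PluginLogCheckSketch <path-to-log-file>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        System.out.println("plugins disabled: " + pluginsDisabled(lines));
        System.out.println("plugins found: " + pluginsFound(lines));
    }
}
```

If the log contains several restarts, the sketch reports the count from the most recent one.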
Result of the extraction
The result of the extraction from a PGN file looks like this:
"OS:FileName": "schach_PGN.pgn", "OS:FileSize": "740", "OS:ModifyDate": "2023-02-10T09:42:24+01:00", "OS:CreateDate": "2023-02-10T09:42:24+01:00", "OS:FileType": "TXT", "OS:MimeType": "text/plain", "ExifTool:ExifToolVersion": "12.52", "System:FileName": "schach_PGN.pgn", "System:Directory": "C:/Temp/Local/Temp/20230210.094224.115-1/389a8d00-25f8-4f84-9d08-238f0e81594c", "System:FileSize": "740", "System:FileModifyDate": "2023:02:10 09:42:24+01:00", "System:FileAccessDate": "2023:02:10 09:42:24+01:00", "System:FileCreateDate": "2023:02:10 09:42:24+01:00", "System:FilePermissions": "100666", "File:FileType": "TXT", "File:FileTypeExtension": "TXT", "File:MIMEType": "text/plain", "File:MIMEEncoding": "us-ascii", "File:Newlines": "\n", "File:LineCount": "16", "File:WordCount": "158", "PGNFILE:Site": "Belgrade, Serbia JUG", "PGNFILE:White": "Fischer, Robert J.", "PGNFILE:Event": "F/S Return Match", "PGNFILE:Round": "29", "PGNFILE:Black": "Spassky, Boris V.", "PGNFILE:Date": "1992-11-04T12:00:00+01:00", "PGNFILE:Result": "1/2-1/2",
Creating a plugin
The following steps describe what to set up and configure to create, deploy, and check a new plugin.
The plugin must comply with the interface defined by PF4J-Spring. In particular, it must provide a constructor that takes a PluginWrapper as its parameter:
    public PgnFilePlugin(PluginWrapper wrapper) {
        super(wrapper);
    }
1. Include the library provided by Optimal Systems for the definition of the plugin interface
The library "extractionservice-plugin-interface" is available as a jar file and must be added to the project. It defines the interface to the Extraction Service and also brings in the plugin capability of PF4J.
The plugin's dependency on the interface library must be declared with the scope "provided" (in Gradle: "compileOnly"):
    <dependency>
        <groupId>com.os.enaio.services.extraction</groupId>
        <artifactId>extractionservice-plugin-interface</artifactId>
        <version>9.0.2</version>
        <scope>provided</scope>
    </dependency>
2. Implement the plugin
As an example of a plugin, the library "extractionservice-plugin-demo" is also available as a jar file.
This example operates on PGN files: https://en.wikipedia.org/wiki/Portable_Game_Notation
Plugin example code:
    package com.os.services.extraction.plugin.demo;

    import com.os.services.extraction.plugin.ExtractionDriver;
    import com.os.services.extraction.plugin.ExtractionInfo;
    import org.apache.commons.io.IOUtils;
    import org.pf4j.Extension;
    import org.pf4j.Plugin;
    import org.pf4j.PluginWrapper;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    import java.io.File;
    import java.io.FileInputStream;
    import java.io.InputStream;
    import java.util.HashMap;
    import java.util.Map;

    public class PgnFilePlugin extends Plugin {

        private static final Logger LOGGER = LoggerFactory.getLogger(PgnFilePlugin.class);

        /**
         * Constructor to be used by plugin manager for plugin instantiation.
         * Your plugins have to provide constructor with this exact signature to
         * be successfully loaded by manager.
         *
         * @param wrapper
         */
        public PgnFilePlugin(PluginWrapper wrapper) {
            super(wrapper);
        }

        @Override
        public void start() {
            LOGGER.info("Start");
        }

        @Override
        public void stop() {
            LOGGER.info("Stop");
        }

        @Extension
        public static class PgnFileProcessor implements ExtractionDriver {

            private static final Logger LOGGER = LoggerFactory.getLogger(PgnFileProcessor.class);
            private static final String PREFIX = "PGNFILE";

            @Override
            public ExtractionInfo info() {
                return null;
            }

            @Override
            public String getExtractionDriverName() {
                return "PgnFile";
            }

            @Override
            public String getExtractionNamespace() {
                return PREFIX;
            }

            @Override
            public boolean isApplicable(File file) {
                return file.getName().endsWith(".pgn");
            }

            @Override
            public Map<String, String> extractData(File file) {
                Map<String, String> result = new HashMap<>();
                // try-with-resources closes the stream even if parsing fails
                try (InputStream in = new FileInputStream(file)) {
                    String str = IOUtils.toString(in, "UTF-8");
                    // each tag pair has the form [Key "Value"]
                    for (String token : str.split("]")) {
                        token = token.trim();
                        if (token.startsWith("[")) {
                            String[] tag = token.substring(1).split("\"");
                            if (tag.length >= 2) {
                                // do not add prefix to key
                                // it will be set by extractionservice
                                String key = tag[0].trim();
                                String value = tag[1].trim();
                                if (key.endsWith("Date")) {
                                    // unknown date parts ("??") become "01"; dots become dashes
                                    value = value.replace("??", "01")
                                            .replace(".", "-") + "T12:00:00+01:00";
                                }
                                result.put(key, value);
                            }
                            if (tag.length == 1) {
                                // do not add prefix to key
                                // it will be set by extractionservice
                                String key = tag[0].trim();
                                result.put(key, "");
                            }
                        }
                    }
                } catch (Exception e) {
                    LOGGER.error("Extraction failed for {}", file.getName(), e);
                }
                return result;
            }
        }
    }
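The tag parsing done in the extractData method of the example plugin can be tried out without the Extraction Service. The sketch below reproduces the same splitting logic using only the JDK; the class name PgnTagParsingSketch and the sample input are illustrative assumptions and not part of the demo library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-alone illustration of the tag parsing used in the example plugin.
final class PgnTagParsingSketch {

    static Map<String, String> parseTags(String pgn) {
        Map<String, String> result = new LinkedHashMap<>();
        // Each tag pair has the form [Key "Value"], so split at the closing bracket.
        for (String token : pgn.split("]")) {
            token = token.trim();
            if (!token.startsWith("[")) {
                continue; // movetext or other non-tag content
            }
            String[] tag = token.substring(1).split("\"");
            if (tag.length >= 2) {
                String key = tag[0].trim();
                String value = tag[1].trim();
                if (key.endsWith("Date")) {
                    // Unknown date parts ("??") become "01"; dots become dashes.
                    value = value.replace("??", "01").replace(".", "-") + "T12:00:00+01:00";
                }
                result.put(key, value);
            } else if (tag.length == 1) {
                // Tag without a quoted value
                result.put(tag[0].trim(), "");
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String sample = "[Event \"F/S Return Match\"]\n"
                + "[Date \"1992.11.04\"]\n"
                + "[Result \"1/2-1/2\"]\n\n"
                + "1. e4 e5 1/2-1/2";
        // Prints each extracted tag as "key = value".
        parseTags(sample).forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

The keys are returned without the namespace, matching the comment in the plugin code that the prefix is set by the Extraction Service.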
3. Build the jar file
When packaging the plugin as a jar file, note the following points:
The following attributes must be specified in the MANIFEST.MF file (the values shown are those of the example plugin):
    Plugin-Class: com.os.services.extraction.plugin.demo.PgnFilePlugin
    Plugin-Dependencies:
    Plugin-Id: PgnFileProcessor-Plugin
    Plugin-Provider: os
    Plugin-Version: 9.0.0
To have these values automatically inserted into the jar file, the "maven-assembly-plugin" can be configured accordingly:
    <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-assembly-plugin</artifactId>
        <version>3.1.0</version>
        <configuration>
            <descriptorRefs>
                <descriptorRef>jar-with-dependencies</descriptorRef>
            </descriptorRefs>
            <finalName>${project.artifactId}-${project.version}-all</finalName>
            <appendAssemblyId>false</appendAssemblyId>
            <attach>false</attach>
            <archive>
                <manifest>
                    <addDefaultImplementationEntries>true</addDefaultImplementationEntries>
                    <addDefaultSpecificationEntries>true</addDefaultSpecificationEntries>
                </manifest>
                <manifestEntries>
                    <Plugin-Id>${plugin.id}</Plugin-Id>
                    <Plugin-Version>${plugin.version}</Plugin-Version>
                    <Plugin-Provider>${plugin.provider}</Plugin-Provider>
                    <Plugin-Class>${plugin.class}</Plugin-Class>
                    <Plugin-Dependencies>${plugin.dependencies}</Plugin-Dependencies>
                </manifestEntries>
            </archive>
        </configuration>
        <executions>
            <execution>
                <id>make-assembly</id>
                <phase>package</phase>
                <goals>
                    <goal>single</goal>
                </goals>
            </execution>
        </executions>
    </plugin>
The jar file must contain all required libraries (except extractionservice-plugin-interface and thus also pf4j).
The MANIFEST.MF file must have the appropriate entries.
The extensions.idx file must be included. This is automatically generated by PF4J during build.
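Before deploying, the packaged jar can be checked against the points above. The following is a minimal sketch using only the JDK; the class name PluginJarCheckSketch is hypothetical, and the location META-INF/extensions.idx is an assumption based on PF4J's default behavior, so adjust it if your build writes the index elsewhere.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

// Hypothetical pre-deployment check for a packaged plugin jar.
final class PluginJarCheckSketch {

    // Manifest attributes that carry a value in the example plugin.
    static final List<String> REQUIRED =
            Arrays.asList("Plugin-Class", "Plugin-Id", "Plugin-Provider", "Plugin-Version");

    // Assumed default location of the extension index generated by PF4J.
    static final String EXTENSIONS_INDEX = "META-INF/extensions.idx";

    // Returns the names of required main attributes that are absent from the manifest.
    static List<String> missingManifestEntries(Manifest manifest) {
        List<String> missing = new ArrayList<>();
        for (String name : REQUIRED) {
            if (manifest == null || manifest.getMainAttributes().getValue(name) == null) {
                missing.add(name);
            }
        }
        return missing;
    }

    // Returns everything that is missing from the jar; an empty list means the checks passed.
    static List<String> check(String jarPath) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            List<String> problems = new ArrayList<>(missingManifestEntries(jar.getManifest()));
            if (jar.getEntry(EXTENSIONS_INDEX) == null) {
                problems.add(EXTENSIONS_INDEX);
            }
            return problems;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: PluginJarCheckSketch <path-to-plugin-jar>");
            return;
        }
        List<String> problems = check(args[0]);
        System.out.println(problems.isEmpty() ? "jar looks complete" : "missing: " + problems);
    }
}
```

The sketch does not verify that the bundled libraries are complete; it only covers the manifest entries and the extension index.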
The generated jar file can now be placed in the plugin directory configured for the Extraction Service.
4. Check the plugin
To check the new plugin, configure the Extraction Service as described in "Including an existing plugin in the Extraction Service".