Interface IngestionHistoryApi
-
- All Known Subinterfaces:
IngestionHistoryApiBatchSupport
- All Known Implementing Classes:
MockUnclusteredIngestionHistory, MockUnclusteredNoopIngestionHistory
public interface IngestionHistoryApi
The IngestionHistoryApi is used to provide incremental capabilities to ingestion clients. The API can be used to track the document IDs last ingested by the API, or signatures such as checksums or message digests. All ingestion history is associated with a namespace, which can be the name of a connector or any other client name associated with a repeating, non-concurrent client. Operations performed through the API are not thread-safe within the same namespace.
Events tracked by this API are not guaranteed to have occurred unless fault-tolerant ingestion is activated. This is because the API is generally used by ingestion clients to mark what they are sending into the system. If a subsequent system failure occurs and fault tolerance is not activated, the visited keys may not have made it to the index. This situation requires use of the clear(String) method to guarantee that documents are not inadvertently skipped because they are considered already indexed.
This API is designed to handle the case where an input document (such as a CSV or Avro file) may contain multiple sub-documents that constitute the actual output to the Attivio system. In this case, the client can call childCreated(String, String, String) in addition to the visit method for the child. This associates the child with the parent, allowing the association to be retrieved during subsequent ingests for proper incremental handling.
-
Method Summary
Modifier and Type, Method, Description
void childCreated(java.lang.String namespace, java.lang.String key, java.lang.String childKey)
    Marks the childKey as one created by the record associated with key.
void clear(java.lang.String namespace)
    Removes all historical information associated with the namespace.
java.lang.Iterable<java.lang.String> getChildren(java.lang.String namespace, java.lang.String key)
    Returns an Iterable of the children that were marked as created via the childCreated(String, String, String) method.
java.util.Date getPreviousStartTime(java.lang.String namespace)
    Returns the start time of the previous session for this namespace.
byte[] getSignature(java.lang.String namespace, java.lang.String key)
    Returns the last signature associated with the key, or null if no signature is present.
java.util.Date getStartTime(java.lang.String namespace)
    Returns the start time of the current session for this namespace.
java.lang.Iterable<java.lang.String> getUnvisited(java.lang.String namespace)
    Returns an Iterable of the keys that have not been visited in the current session.
java.lang.Iterable<java.lang.String> getUnvisited(java.lang.String namespace, java.util.Date since)
    Returns an Iterable of the keys that have not been visited since the time since.
void removeDocumentByKey(java.lang.String namespace, java.lang.String key)
    Removes a document using the key.
java.util.Date startSession(java.lang.String namespace)
    Starts a new session for the namespace.
void visit(java.lang.String namespace, java.lang.String key, byte[] signature)
    Records the key as having been visited and updates its associated signature.
-
Method Detail
-
startSession
java.util.Date startSession(java.lang.String namespace) throws AttivioException
Starts a new session for the namespace. The current time is used to establish the start time for this session. Subsequent calls to getStartTime(String) will return the new session start time.
- Parameters:
namespace - a namespace to use (e.g., connector or source name)
- Returns:
- the session start time
- Throws:
AttivioException
-
getStartTime
java.util.Date getStartTime(java.lang.String namespace) throws AttivioException
Returns the start time of the current session for this namespace. The return value is the same as the value returned by the last startSession(String) call for this namespace. If no session has ever been started, returns null.
- Parameters:
namespace - a namespace to use (e.g., connector or source name)
- Returns:
- the start time for this namespace
- Throws:
AttivioException
-
getPreviousStartTime
java.util.Date getPreviousStartTime(java.lang.String namespace) throws AttivioException
Returns the start time of the previous session for this namespace. If no previous session exists, returns null.
- Parameters:
namespace - a namespace to use (e.g., connector or source name)
- Returns:
- previous start time
- Throws:
AttivioException
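The relationship between the three session methods can be sketched with a small in-memory model. This is only an illustration: the class SessionClock is hypothetical and is not part of the Attivio API, and it assumes that the "previous" start time is the one that was current before the most recent startSession call.

```java
import java.util.Date;
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in, NOT the Attivio implementation.
class SessionClock {
    private final Map<String, Date> current = new HashMap<>();
    private final Map<String, Date> previous = new HashMap<>();

    // Plays the role of startSession(String): the old current start time
    // becomes the previous one (an assumption about the real semantics).
    Date startSession(String namespace) {
        Date now = new Date();
        Date old = current.put(namespace, now);
        if (old != null) {
            previous.put(namespace, old);
        }
        return now;
    }

    // Plays the role of getStartTime(String); null if no session was started.
    Date getStartTime(String namespace) {
        return current.get(namespace);
    }

    // Plays the role of getPreviousStartTime(String); null if there was
    // no earlier session.
    Date getPreviousStartTime(String namespace) {
        return previous.get(namespace);
    }
}
```

In this model, a connector's first run sees null from getPreviousStartTime, which it could treat as a signal to perform a full rather than incremental ingest.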
-
visit
void visit(java.lang.String namespace, java.lang.String key, byte[] signature) throws AttivioException
Records the key as having been visited and updates its associated signature. The signature is commonly used to detect whether an object has changed since the last ingestion (by storing a checksum or message digest of the content). The visit time is also recorded.
- Parameters:
namespace - a namespace to use (e.g., connector or source name)
key - a record key or document ID.
signature - an arbitrary value indicating a signature to associate with the key, may be null.
- Throws:
AttivioException
-
getSignature
byte[] getSignature(java.lang.String namespace, java.lang.String key) throws AttivioException
Returns the last signature associated with the key, or null if no signature is present.
- Parameters:
namespace - a namespace to use (e.g., connector or source name)
key - a record key or document ID.
- Returns:
- the signature associated with the key
- Throws:
AttivioException
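One way a client might combine visit and getSignature for change detection is sketched below. The class SignatureCheck and its map-backed store are hypothetical stand-ins rather than Attivio code, the namespace argument is omitted for brevity, and SHA-256 is just one possible choice of message digest.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, NOT the Attivio implementation.
class SignatureCheck {
    // Stand-in for the history store: key -> last recorded signature.
    private final Map<String, byte[]> signatures = new HashMap<>();

    // Computes a SHA-256 digest of the content to use as the signature.
    static byte[] digest(String content) {
        try {
            return MessageDigest.getInstance("SHA-256")
                    .digest(content.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Records the visit and reports whether the content is new or changed.
    boolean visitAndCheckChanged(String key, String content) {
        byte[] current = digest(content);
        byte[] previous = signatures.get(key); // role of getSignature
        signatures.put(key, current);          // role of visit
        // Arrays must be compared by content, not by reference.
        return previous == null || !Arrays.equals(previous, current);
    }
}
```

A connector using this pattern would send the document for indexing only when the method returns true, while still recording the visit so the key is not later reported as unvisited.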
-
getUnvisited
java.lang.Iterable<java.lang.String> getUnvisited(java.lang.String namespace) throws AttivioException
Returns an Iterable of the keys that have not been visited in the current session. The unvisited keys are those for which there has been a visit(String, String, byte[]) call at some point, but not one since the last startSession(String) call. This allows a connector to record visits each time it runs and get a list of documents that are not present on subsequent runs. Since these documents have been removed from the source system, the connector may decide to remove them from the Attivio system.
A call to remove() on the returned iterator will remove the visit and signature information from the history for the associated key. All child associations (see childCreated(String, String, String)) of removed keys are also removed.
- Parameters:
namespace - a namespace to use (e.g., connector or source name)
- Returns:
- an Iterable of unvisited keys
- Throws:
AttivioException
-
getUnvisited
java.lang.Iterable<java.lang.String> getUnvisited(java.lang.String namespace, java.util.Date since) throws AttivioException
Returns an Iterable of the keys that have not been visited since the time since. The unvisited keys are those for which there has been a visit(String, String, byte[]) call at some point, but not one since the time since. This allows a connector to record visits each time it runs and get a list of documents that are not present on subsequent runs. Since these documents have been removed from the source system, the connector may decide to remove them from the Attivio system.
A call to remove() on the returned iterator will remove the visit and signature information from the history for the associated key. All child associations (see childCreated(String, String, String)) of removed keys are also removed.
- Parameters:
namespace - a namespace to use (e.g., connector or source name)
since - the minimum visit date to consider a record as visited.
- Returns:
- an Iterable of unvisited keys
- Throws:
AttivioException
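The unvisited-key behavior described for both getUnvisited variants can be illustrated with a small model. The class VisitLog below is hypothetical and not Attivio code: it handles a single namespace, and it uses a logical counter instead of wall-clock dates so the example stays deterministic.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model, NOT the Attivio implementation.
class VisitLog {
    private final Map<String, Long> lastVisit = new HashMap<>(); // key -> visit time
    private long clock = 0;        // logical clock standing in for java.util.Date
    private long sessionStart = 0;

    void startSession() { sessionStart = ++clock; }

    void visit(String key) { lastVisit.put(key, ++clock); }

    boolean isTracked(String key) { return lastVisit.containsKey(key); }

    // Keys visited at some point, but not since the current session started.
    Iterable<String> getUnvisited() {
        List<String> stale = new ArrayList<>();
        for (Map.Entry<String, Long> e : lastVisit.entrySet()) {
            if (e.getValue() < sessionStart) {
                stale.add(e.getKey());
            }
        }
        return () -> new Iterator<String>() {
            private final Iterator<String> inner = stale.iterator();
            private String last;

            @Override public boolean hasNext() { return inner.hasNext(); }

            @Override public String next() {
                last = inner.next();
                return last;
            }

            // Removing through the iterator drops the key's history.
            @Override public void remove() {
                if (last == null) {
                    throw new IllegalStateException("next() has not been called");
                }
                lastVisit.remove(last);
                last = null;
            }
        };
    }
}
```

A connector following this pattern would, at the end of a run, iterate over the unvisited keys, delete each corresponding document from the index, and call remove() on the iterator so the key is not reported again.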
-
childCreated
void childCreated(java.lang.String namespace, java.lang.String key, java.lang.String childKey) throws AttivioException
Marks the childKey as one created by the record associated with key. All children associated with key are returned by the Iterable returned by a subsequent call to getChildren(String, String). The expectation is that the child document has not previously been created. If the signature of the parent document has changed, remove the child documents and add the ones found in the new version of the parent document. Adding a child that already exists has undefined consequences.
Children of children are not supported; only a parent document and its direct children are. Registering children of children has undefined consequences.
- Parameters:
namespace - a namespace to use (e.g., connector or source name)
key - a record key or document ID.
childKey - a record key or document ID.
- Throws:
AttivioException
-
getChildren
java.lang.Iterable<java.lang.String> getChildren(java.lang.String namespace, java.lang.String key) throws AttivioException
Returns an Iterable of the children that were marked as created via the childCreated(String, String, String) method. When children are removed through the returned Iterable, the expectation is that all of the children are removed.
- Parameters:
namespace - a namespace to use (e.g., connector or source name)
key - a record key or document ID.
- Returns:
- an Iterable of childKeys for the key
- Throws:
AttivioException
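The parent and child handling described for childCreated and getChildren might look roughly like the following. The class ParentChildHistory is a hypothetical in-memory model rather than Attivio code, with the namespace omitted; it assumes the client replaces all children whenever the parent's signature changes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model, NOT the Attivio implementation.
class ParentChildHistory {
    private final Map<String, byte[]> signatures = new HashMap<>();
    private final Map<String, List<String>> children = new HashMap<>();

    // Re-registers children only when the parent's signature has changed.
    // Returns the child keys of the previous version, all of which were
    // removed and would need deleting from the index.
    List<String> ingestParent(String key, byte[] signature, List<String> childKeys) {
        byte[] old = signatures.put(key, signature);   // role of getSignature + visit
        if (old != null && Arrays.equals(old, signature)) {
            return new ArrayList<>();                  // unchanged: keep associations
        }
        List<String> removed = children.remove(key);   // all old children go together
        children.put(key, new ArrayList<>(childKeys)); // role of childCreated per child
        return removed == null ? new ArrayList<>() : removed;
    }

    // Plays the role of getChildren(String, String).
    List<String> getChildren(String key) {
        return children.getOrDefault(key, new ArrayList<>());
    }
}
```

Removing every old child before registering the new ones avoids adding a child that already exists, which the description above says has undefined consequences.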
-
clear
void clear(java.lang.String namespace) throws AttivioException
Removes all historical information associated with the namespace.
- Parameters:
namespace - a namespace to use (e.g., connector or source name)
- Throws:
AttivioException
-
removeDocumentByKey
void removeDocumentByKey(java.lang.String namespace, java.lang.String key) throws AttivioException
Removes a document using the key.
- Parameters:
namespace - a namespace to use (e.g., connector or source name)
key - a record key or document ID.
- Throws:
AttivioException
-