com.g11ntoolkit.strfile
Class LTXLIFFParser

java.lang.Object
  |
  +--org.xml.sax.helpers.DefaultHandler
        |
        +--com.g11ntoolkit.strfile.LTXLIFFParser
All Implemented Interfaces:
org.xml.sax.ContentHandler, org.xml.sax.DTDHandler, org.xml.sax.EntityResolver, org.xml.sax.ErrorHandler

public class LTXLIFFParser
extends org.xml.sax.helpers.DefaultHandler

Parses a file in XLIFF format.

Uses SAX and processes the callbacks in the parsing lifecycle.

A G11NToolKit object is created or modified as needed for each element in the file. When the parse is complete, a StrFile object and a TokFile object are in memory, populated with the objects read from the file and ready for further processing.
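
The callback lifecycle mentioned above can be sketched with a small stand-alone handler. This is only an illustration: LifecycleTrace and its trace method are hypothetical names, not part of the toolkit, and the handler records callbacks instead of building StrFile or TokFile objects.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in for LTXLIFFParser: it only records the callbacks it receives.
class LifecycleTrace extends DefaultHandler {
    final List<String> events = new ArrayList<>();

    @Override
    public void startDocument() {
        events.add("startDocument");
    }

    @Override
    public void endDocument() {
        events.add("endDocument");
    }

    @Override
    public void startElement(String namespaceURI, String localName, String rawName, Attributes attrs) {
        events.add("start:" + rawName);
    }

    @Override
    public void endElement(String namespaceURI, String localName, String rawName) {
        events.add("end:" + rawName);
    }

    // Parses the XML text and returns the callbacks in the order they arrived.
    static List<String> trace(String xml) {
        LifecycleTrace handler = new LifecycleTrace();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
        return handler.events;
    }

    public static void main(String[] args) {
        trace("<xliff><file><body/></file></xliff>").forEach(System.out::println);
    }
}
```

The first recorded event is startDocument and the last is endDocument, with the element callbacks nested between them in document order.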

Version:
2005/07/06
Author:
Bill Rich, Wilandra Consulting LLC
Copyright © 2004-2005, Wilandra Consulting LLC. All rights reserved.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

See License Agreement.


Field Summary
private  java.lang.String aString
          A work area to accumulate characters into a string.
private  java.lang.String category
          A work area to hold the file category.
private  boolean contextFile
          A state variable indicating that the processing is for a Context File Name.
private  java.lang.String contextFileName
          A work area to hold the Context File Name.
private  boolean contextKey
          A state variable indicating that the processing is for a Context Key.
private  java.lang.String contextKeyName
          A work area to hold the Context Key Name.
private  java.lang.String datatype
          A work area to hold the data type.
private  java.lang.String fileName
          A work area to hold the file name for context information.
private  boolean inBody
          A state variable indicating that the processing is within a Body tag.
private  boolean inContext
          A state variable indicating that the processing is within a Context tag.
private  boolean inContextGroup
          A state variable indicating that the processing is within a Context Group tag.
private  boolean inFile
          A state variable indicating that the processing is within a File tag.
private  boolean inHeader
          A state variable indicating that the processing is within a Header tag.
private  boolean inIntFile
          A state variable indicating that the processing is within an Internal File tag.
private  boolean inNote
          A state variable indicating that the processing is within a Note tag.
private  boolean inSkeletonFile
          A state variable indicating that the processing is within a SKL tag.
private  boolean inSourceString
          A state variable indicating that the processing is within a Source tag.
private  boolean inTargetString
          A state variable indicating that the processing is within a Target tag.
private  boolean inTool
          A state variable indicating that the processing is within a Tool tag.
private  boolean inTU
          A state variable indicating that the processing is within a Translation Unit tag.
private  boolean inXLIFFFile
          A state variable indicating that the processing is within an XLIFF tag.
private static java.util.logging.Logger log
          The log used for all messages from this class.
protected static java.util.ResourceBundle mrb
          Messages used by the tools and classes.
private  java.io.OutputStreamWriter osw
          The output writer to use when writing out the file, if needed.
private  java.lang.String productName
          A work area to hold the product name.
private  java.lang.String productVersion
          A work area to hold the product version.
private  java.lang.String sourceLanguage
          A work area to hold the source language code.
private  StrFile strfile
          The target StrFile object to be loaded from the XLIFF file.
private  java.lang.String targetLanguage
          A work area to hold the target language code.
private  java.lang.String tokenNumber
          A work area to hold the Token Number.
private  TokFile tokfile
          The TokFile object to be loaded from the XLIFF file.
protected static java.util.ResourceBundle vrb
          Constants and variables used by the tools and classes.
private static java.util.ResourceBundle xrb
          Constants, messages, and variables used by the tools and classes for XML processing.
 
Constructor Summary
LTXLIFFParser()
           
 
Method Summary
 void characters(char[] ch, int start, int length)
          Processes character data (within an element).
 void endDocument()
          Processes the end of a Document parse.
 void endElement(java.lang.String namespaceURI, java.lang.String localName, java.lang.String rawName)
          Indicates the end of an element.
 void endPrefixMapping(java.lang.String prefix)
          Processes the end of a prefix mapping.
protected  void eoB()
          Process end of body element.
protected  void eoC()
          Process end of context element.
protected  void eoCG()
          Process end of context group element.
protected  void eoF()
          Process end of file element.
protected  void eoH()
          Process end of header element.
protected  void eoHN()
          Process end of note element when found in a header element.
protected  void eoIF()
          Process end of the internal file element.
protected  void eoSF()
          Process end of skeleton file element.
protected  void eoSS()
          Process end of source string element.
protected  void eoT()
          Process end of tool element.
protected  void eoTS()
          Process end of target string element.
protected  void eoTU()
          Process end of translation unit element.
protected  void eoTUN()
          Process end of note element in a translation unit element.
 StrFile getStrFile()
          Returns the StrFile object that was loaded from the XLIFF file.
 TokFile getTokFile()
          Returns the TokFile object that was loaded from the XLIFF file.
 void ignorableWhitespace(char[] ch, int start, int length)
          Processes whitespace that can be ignored in the originating document.
 void processingInstruction(java.lang.String target, java.lang.String data)
          Processes a processing instruction (other than the XML declaration) when it is encountered.
 void setOSW(java.io.OutputStreamWriter anOSW)
          Sets output writer.
 void setStrFile(StrFile sf)
          Sets the StrFile object to load from the XLIFF file.
 void setTokFile(TokFile tf)
          Sets the TokFile object to load from the XLIFF file.
 void startDocument()
          Processes the start of a Document parse.
 void startElement(java.lang.String namespaceURI, java.lang.String localName, java.lang.String rawName, org.xml.sax.Attributes attrs)
          Processes the occurrence of an actual element.
 void startPrefixMapping(java.lang.String prefix, java.lang.String uri)
          Processes the beginning of an XML Namespace prefix mapping.
 
Methods inherited from class org.xml.sax.helpers.DefaultHandler
error, fatalError, notationDecl, resolveEntity, setDocumentLocator, skippedEntity, unparsedEntityDecl, warning
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

log

private static java.util.logging.Logger log
The log used for all messages from this class.


mrb

protected static java.util.ResourceBundle mrb
Messages used by the tools and classes.


vrb

protected static java.util.ResourceBundle vrb
Constants and variables used by the tools and classes.


xrb

private static java.util.ResourceBundle xrb
Constants, messages, and variables used by the tools and classes for XML processing.


osw

private java.io.OutputStreamWriter osw
The output writer to use when writing out the file, if needed.

The output writer can be set when it is needed. If the resulting object structure will not be written to a file then this never needs to be set.


strfile

private StrFile strfile
The target StrFile object to be loaded from the XLIFF file.

This must be set using the setStrFile method before starting the parse. When the parse is finished use the getStrFile method to retrieve the current target StrFile object.

Only the target string entries in the XLIFF are used to create the StrFile. All source string entries are ignored.
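
The target-only behavior can be sketched with state flags similar to inTU and inTargetString. This is a sketch under stated assumptions: TargetCollector, collect, and the map that stands in for a StrFile are hypothetical, and the real class may key its entries differently.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: keeps target strings only; a Map stands in for the StrFile.
class TargetCollector extends DefaultHandler {
    private final Map<String, String> targets = new LinkedHashMap<>();
    private final StringBuilder aString = new StringBuilder();
    private boolean inTU;
    private boolean inTargetString;
    private String unitId;

    @Override
    public void startElement(String namespaceURI, String localName, String rawName, Attributes attrs) {
        if ("trans-unit".equals(rawName)) {
            inTU = true;
            unitId = attrs.getValue("id");
        } else if ("target".equals(rawName) && inTU) {
            inTargetString = true;
            aString.setLength(0); // start a fresh accumulation for this target
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Text inside <source> is skipped because inTargetString is false there.
        if (inTargetString) {
            aString.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String namespaceURI, String localName, String rawName) {
        if ("target".equals(rawName) && inTargetString) {
            targets.put(unitId, aString.toString());
            inTargetString = false;
        } else if ("trans-unit".equals(rawName)) {
            inTU = false;
        }
    }

    // Parses the XML text and returns the collected target strings keyed by unit id.
    static Map<String, String> collect(String xml) {
        TargetCollector handler = new TargetCollector();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
        return handler.targets;
    }
}
```

Because characters only appends while inTargetString is set, the text of every source element is dropped without any extra filtering.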


tokfile

private TokFile tokfile
The TokFile object to be loaded from the XLIFF file.

This must be set using the setTokFile method before starting the parse. When the parse is finished use the getTokFile method to retrieve the current TokFile object.


inXLIFFFile

private boolean inXLIFFFile
A state variable indicating that the processing is within an XLIFF tag.


inBody

private boolean inBody
A state variable indicating that the processing is within a Body tag.


inContext

private boolean inContext
A state variable indicating that the processing is within a Context tag.


contextFile

private boolean contextFile
A state variable indicating that the processing is for a Context File Name.


contextKey

private boolean contextKey
A state variable indicating that the processing is for a Context Key.


inContextGroup

private boolean inContextGroup
A state variable indicating that the processing is within a Context Group tag.


inFile

private boolean inFile
A state variable indicating that the processing is within a File tag.


inHeader

private boolean inHeader
A state variable indicating that the processing is within a Header tag.


inIntFile

private boolean inIntFile
A state variable indicating that the processing is within an Internal File tag.


inNote

private boolean inNote
A state variable indicating that the processing is within a Note tag.


inSkeletonFile

private boolean inSkeletonFile
A state variable indicating that the processing is within a SKL tag.


inSourceString

private boolean inSourceString
A state variable indicating that the processing is within a Source tag.


inTargetString

private boolean inTargetString
A state variable indicating that the processing is within a Target tag.


inTool

private boolean inTool
A state variable indicating that the processing is within a Tool tag.


inTU

private boolean inTU
A state variable indicating that the processing is within a Translation Unit tag.


aString

private java.lang.String aString
A work area to accumulate characters into a string.


fileName

private java.lang.String fileName
A work area to hold the file name for context information.


sourceLanguage

private java.lang.String sourceLanguage
A work area to hold the source language code.


targetLanguage

private java.lang.String targetLanguage
A work area to hold the target language code.


datatype

private java.lang.String datatype
A work area to hold the data type.


productName

private java.lang.String productName
A work area to hold the product name.


productVersion

private java.lang.String productVersion
A work area to hold the product version.


category

private java.lang.String category
A work area to hold the file category.


contextFileName

private java.lang.String contextFileName
A work area to hold the Context File Name.


contextKeyName

private java.lang.String contextKeyName
A work area to hold the Context Key Name.


tokenNumber

private java.lang.String tokenNumber
A work area to hold the Token Number.

Constructor Detail

LTXLIFFParser

public LTXLIFFParser()
Method Detail

setOSW

public void setOSW(java.io.OutputStreamWriter anOSW)
Sets output writer.

Parameters:
anOSW - an OutputStreamWriter specifying the output writer to use

setStrFile

public void setStrFile(StrFile sf)
Sets the StrFile object to load from the XLIFF file.

The StrFile object must be set before the parse is begun.

Parameters:
sf - a StrFile specifying the StrFile object to use
See Also:
getStrFile()

getStrFile

public StrFile getStrFile()
Returns the StrFile object that was loaded from the XLIFF file.

The StrFile object is not available outside this class unless this method is used to retrieve it.

Returns:
a StrFile representing the StrFile object that was loaded
See Also:
setStrFile(com.g11ntoolkit.strfile.StrFile)

setTokFile

public void setTokFile(TokFile tf)
Sets the TokFile object to load from the XLIFF file.

The TokFile object must be set before the parse is begun.

Parameters:
tf - a TokFile specifying the TokFile object to use
See Also:
getTokFile()

getTokFile

public TokFile getTokFile()
Returns the TokFile object that was loaded from the XLIFF file.

The TokFile object is not available outside this class unless this method is used to retrieve it.

Returns:
a TokFile representing the TokFile object that was loaded
See Also:
setTokFile(com.g11ntoolkit.tokfile.TokFile)

processingInstruction

public void processingInstruction(java.lang.String target,
                                  java.lang.String data)
                           throws org.xml.sax.SAXException
Processes a processing instruction (other than the XML declaration) when it is encountered.

Specified by:
processingInstruction in interface org.xml.sax.ContentHandler
Overrides:
processingInstruction in class org.xml.sax.helpers.DefaultHandler
Parameters:
target - a String specifying the target of the PI
data - a String containing all data sent to the PI; this typically looks like one or more attribute-value pairs

Throws:
org.xml.sax.SAXException - if an error occurs while handling the processing instruction

startPrefixMapping

public void startPrefixMapping(java.lang.String prefix,
                               java.lang.String uri)
                        throws org.xml.sax.SAXException
Processes the beginning of an XML Namespace prefix mapping.

Although this typically occurs within the root element of an XML document, it can occur at any point within the document. Note that a prefix mapping on an element triggers this callback before the callback for the actual element itself (startElement(java.lang.String, java.lang.String, java.lang.String, org.xml.sax.Attributes)) occurs.

Specified by:
startPrefixMapping in interface org.xml.sax.ContentHandler
Overrides:
startPrefixMapping in class org.xml.sax.helpers.DefaultHandler
Parameters:
prefix - a String specifying the prefix used for the namespace being reported
uri - a String specifying the URI for the namespace being reported
Throws:
org.xml.sax.SAXException - if an error occurs while handling the start of the prefix mapping
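
The ordering described for this callback can be observed with a small trace handler. This is a minimal sketch: PrefixOrder and trace are hypothetical names, and the parser must be namespace-aware for prefix mapping callbacks to be delivered.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: records prefix mapping and element callbacks in arrival order.
class PrefixOrder extends DefaultHandler {
    final List<String> events = new ArrayList<>();

    @Override
    public void startPrefixMapping(String prefix, String uri) {
        events.add("startPrefixMapping:" + prefix);
    }

    @Override
    public void endPrefixMapping(String prefix) {
        events.add("endPrefixMapping:" + prefix);
    }

    @Override
    public void startElement(String namespaceURI, String localName, String rawName, Attributes attrs) {
        events.add("startElement:" + localName);
    }

    @Override
    public void endElement(String namespaceURI, String localName, String rawName) {
        events.add("endElement:" + localName);
    }

    // Parses the XML text with namespace processing enabled and returns the trace.
    static List<String> trace(String xml) {
        PrefixOrder handler = new PrefixOrder();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // required for prefix mapping callbacks
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
        return handler.events;
    }

    public static void main(String[] args) {
        trace("<x:xliff xmlns:x='urn:example:xliff'><x:file/></x:xliff>")
                .forEach(System.out::println);
    }
}
```

In the trace, the startPrefixMapping entry for x appears before the startElement entry for the element that declares it, and the matching endPrefixMapping entry follows that element's endElement entry.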

endPrefixMapping

public void endPrefixMapping(java.lang.String prefix)
                      throws org.xml.sax.SAXException
Processes the end of a prefix mapping.

This is when the namespace reported in a startPrefixMapping(java.lang.String, java.lang.String) callback is no longer available.

Specified by:
endPrefixMapping in interface org.xml.sax.ContentHandler
Overrides:
endPrefixMapping in class org.xml.sax.helpers.DefaultHandler
Parameters:
prefix - a String specifying the prefix of the namespace being reported
Throws:
org.xml.sax.SAXException - if an error occurs while handling the end of the prefix mapping

startDocument

public void startDocument()
                   throws org.xml.sax.SAXException
Processes the start of a Document parse.

Called once when the document is first opened. Precedes all other callbacks in all SAX handlers, with the sole exception of setDocumentLocator.

Specified by:
startDocument in interface org.xml.sax.ContentHandler
Overrides:
startDocument in class org.xml.sax.helpers.DefaultHandler
Throws:
org.xml.sax.SAXException - if an error occurs while handling the start of the document

endDocument

public void endDocument()
                 throws org.xml.sax.SAXException
Processes the end of a Document parse.

Called once when the document is closed. This occurs after all callbacks in all SAX Handlers.

Specified by:
endDocument in interface org.xml.sax.ContentHandler
Overrides:
endDocument in class org.xml.sax.helpers.DefaultHandler
Throws:
org.xml.sax.SAXException - if an error occurs while handling the end of the document

startElement

public void startElement(java.lang.String namespaceURI,
                         java.lang.String localName,
                         java.lang.String rawName,
                         org.xml.sax.Attributes attrs)
                  throws org.xml.sax.SAXException
Processes the occurrence of an actual element.

Includes the element's attributes. When namespace processing is enabled and the SAX namespace-prefixes feature is off (its default), namespace declarations such as xmlns:[namespace prefix] are not reported as attributes.

Code is added to this method for each element that has specific processing needs when it starts. Other code to handle the end of an element is in the endElement(java.lang.String, java.lang.String, java.lang.String) method.

Specified by:
startElement in interface org.xml.sax.ContentHandler
Overrides:
startElement in class org.xml.sax.helpers.DefaultHandler
Parameters:
namespaceURI - a String specifying the namespace URI this element is associated with, or an empty String
localName - a String specifying the name of the element (with no namespace prefix, if one is present)
rawName - a String specifying the XML 1.0 version of element name: [namespace prefix]:[localName]
attrs - an Attributes object containing a list of attributes for this element
Throws:
org.xml.sax.SAXException - if an error occurs while handling the start of the element
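
Element-specific start processing might look like the following sketch, which copies attributes of a file element into work areas named after the fields of this class. FileAttributes and read are hypothetical; the attribute names source-language, target-language, and datatype come from the XLIFF 1.x file element, and the real method may read other attributes as well.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: captures attributes of the <file> element when it starts.
class FileAttributes extends DefaultHandler {
    String sourceLanguage;
    String targetLanguage;
    String datatype;

    @Override
    public void startElement(String namespaceURI, String localName, String rawName, Attributes attrs) {
        // localName is empty when the parser is not namespace-aware, so fall back to rawName.
        String name = (localName == null || localName.isEmpty()) ? rawName : localName;
        if ("file".equals(name)) {
            sourceLanguage = attrs.getValue("source-language");
            targetLanguage = attrs.getValue("target-language");
            datatype = attrs.getValue("datatype");
        }
    }

    // Parses the XML text and returns the handler holding the captured values.
    static FileAttributes read(String xml) {
        FileAttributes handler = new FileAttributes();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
        return handler;
    }
}
```

An attribute that is absent from the element yields null from getValue, so callers of such a handler should be prepared for null work areas.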

endElement

public void endElement(java.lang.String namespaceURI,
                       java.lang.String localName,
                       java.lang.String rawName)
                throws org.xml.sax.SAXException
Indicates the end of an element.

Shows that the </[element name]> tag has been reached. Note that the parser does not distinguish between empty elements and non-empty elements, so this occurs uniformly.

Code is added to this method for each element that has specific processing needs when it ends. Other code to handle the start of an element is in the startElement(java.lang.String, java.lang.String, java.lang.String, org.xml.sax.Attributes) method.

Specified by:
endElement in interface org.xml.sax.ContentHandler
Overrides:
endElement in class org.xml.sax.helpers.DefaultHandler
Parameters:
namespaceURI - a String specifying the namespace URI this element is associated with, or an empty String
localName - a String specifying the name of the element (with no namespace prefix, if one is present)
rawName - a String specifying the XML 1.0 version of element name: [namespace prefix]:[localName]
Throws:
org.xml.sax.SAXException - if an error occurs while handling the end of the element
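
One way to organize end-of-element processing is to dispatch on the element name to small helper methods. The sketch below borrows the names eoTU and eoTUN from this class, but their bodies, the EndDispatch class, and the run method are hypothetical. It also shows that an empty note element and a non-empty one produce the same callback.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: endElement dispatches to one helper per element type.
class EndDispatch extends DefaultHandler {
    final List<String> log = new ArrayList<>();

    @Override
    public void endElement(String namespaceURI, String localName, String rawName) {
        switch (rawName) {
            case "trans-unit":
                eoTU();
                break;
            case "note":
                eoTUN();
                break;
            default:
                break; // elements without end-of-element processing
        }
    }

    // Placeholder bodies: the real helpers update the StrFile and TokFile objects.
    protected void eoTU() {
        log.add("eoTU");
    }

    protected void eoTUN() {
        log.add("eoTUN");
    }

    // Parses the XML text and returns the helper calls in the order they were made.
    static List<String> run(String xml) {
        EndDispatch handler = new EndDispatch();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
        return handler.log;
    }
}
```

Parsing a body in which one translation unit contains an empty-element note tag and another contains a note with separate start and end tags calls eoTUN once for each, followed by eoTU for the enclosing unit.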

characters

public void characters(char[] ch,
                       int start,
                       int length)
                throws org.xml.sax.SAXException
Processes character data (within an element).

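The parser may deliver the text of a single element in more than one call, so the characters are accumulated into a String. The end-of-element processing for each element decides what to do with the accumulated String.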

Specified by:
characters in interface org.xml.sax.ContentHandler
Overrides:
characters in class org.xml.sax.helpers.DefaultHandler
Parameters:
ch - a char[] specifying a character array that contains the character data
start - an int specifying the index in the array where the data starts
length - an int specifying the length of the string
Throws:
org.xml.sax.SAXException - if an error occurs while handling the character data
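
Accumulating character data might be sketched as follows. A SAX parser is free to split the text of one element across several characters calls, so the sketch appends each chunk and reads the result only when the element ends. TextAccumulator and targetText are hypothetical names, and a StringBuilder is used here where this class documents a String work area.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: accumulates the text of a <target> element across callbacks.
class TextAccumulator extends DefaultHandler {
    private final StringBuilder aString = new StringBuilder();
    private boolean inTargetString;
    String text;

    @Override
    public void startElement(String namespaceURI, String localName, String rawName, Attributes attrs) {
        if ("target".equals(rawName)) {
            inTargetString = true;
            aString.setLength(0); // discard anything left from a previous element
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Use only the slice [start, start + length); the rest of the array is not ours.
        if (inTargetString) {
            aString.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String namespaceURI, String localName, String rawName) {
        if ("target".equals(rawName)) {
            text = aString.toString(); // the complete text, however it was chunked
            inTargetString = false;
        }
    }

    // Parses the XML text and returns the accumulated text of the last <target>.
    static String targetText(String xml) {
        TextAccumulator handler = new TextAccumulator();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
        return handler.text;
    }
}
```

Text that contains an entity reference, such as Fish &amp;amp; chips, is a common case where parsers deliver several chunks; appending them yields the full string regardless of how it was split.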

ignorableWhitespace

public void ignorableWhitespace(char[] ch,
                                int start,
                                int length)
                         throws org.xml.sax.SAXException
Processes whitespace that can be ignored in the originating document.

Typically invoked only when validation is occurring in the parsing process.

Specified by:
ignorableWhitespace in interface org.xml.sax.ContentHandler
Overrides:
ignorableWhitespace in class org.xml.sax.helpers.DefaultHandler
Parameters:
ch - a char[] specifying a character array that contains the character data
start - an int specifying the index in the array where the data starts
length - an int specifying the length of the ignorable white space
Throws:
org.xml.sax.SAXException - if an error occurs while handling the whitespace

eoT

protected void eoT()
Process end of tool element.


eoIF

protected void eoIF()
Process end of the internal file element.


eoSF

protected void eoSF()
Process end of skeleton file element.


eoHN

protected void eoHN()
Process end of note element when found in a header element.


eoH

protected void eoH()
Process end of header element.


eoC

protected void eoC()
Process end of context element.


eoCG

protected void eoCG()
Process end of context group element.


eoSS

protected void eoSS()
Process end of source string element.


eoTS

protected void eoTS()
             throws MalformedToken
Process end of target string element.

Throws:
MalformedToken

eoTUN

protected void eoTUN()
Process end of note element in a translation unit element.


eoTU

protected void eoTU()
             throws MalformedToken
Process end of translation unit element.

Throws:
MalformedToken

eoB

protected void eoB()
            throws MalformedToken
Process end of body element.

Throws:
MalformedToken

eoF

protected void eoF()
            throws org.xml.sax.SAXException,
                   MalformedToken
Process end of file element.

Throws:
org.xml.sax.SAXException
MalformedToken