java.lang.Object
org.apache.xerces.parsers.XMLParser
org.apache.xerces.parsers.AbstractXMLDocumentParser
org.apache.xerces.parsers.AbstractSAXParser
file2xliff4j.HtmlImporter
public class HtmlImporter
The HtmlImporter is used to import HTML to (what else?) XLIFF.
Nested Class Summary |
---|
Nested classes/interfaces inherited from class org.apache.xerces.parsers.AbstractSAXParser |
---|
org.apache.xerces.parsers.AbstractSAXParser.AttributesProxy, org.apache.xerces.parsers.AbstractSAXParser.LocatorProxy |
Field Summary |
---|
Fields inherited from class org.apache.xerces.parsers.AbstractSAXParser |
---|
ALLOW_UE_AND_NOTATION_EVENTS, DECLARATION_HANDLER, DOM_NODE, fContentHandler, fDeclaredAttrs, fDeclHandler, fDocumentHandler, fDTDHandler, fLexicalHandler, fLexicalHandlerParameterEntities, fNamespaceContext, fNamespacePrefixes, fNamespaces, fParseInProgress, fQName, fResolveDTDURIs, fStandalone, fUseEntityResolver2, fVersion, fXMLNSURIs, LEXICAL_HANDLER, NAMESPACE_PREFIXES, NAMESPACES, STRING_INTERNING |
Fields inherited from class org.apache.xerces.parsers.AbstractXMLDocumentParser |
---|
fDocumentSource, fDTDContentModelSource, fDTDSource, fInDTD |
Fields inherited from class org.apache.xerces.parsers.XMLParser |
---|
ENTITY_RESOLVER, ERROR_HANDLER, fConfiguration |
Fields inherited from interface file2xliff4j.Converter |
---|
BLKSIZE, formatSuffix, skeletonSuffix, startXliff, stylesTSkeletonSuffix, tSkeletonSuffix, xliffSuffix, xmlDeclaration |
Fields inherited from interface org.apache.xerces.xni.XMLDTDHandler |
---|
CONDITIONAL_IGNORE, CONDITIONAL_INCLUDE |
Fields inherited from interface org.apache.xerces.xni.XMLDTDContentModelHandler |
---|
OCCURS_ONE_OR_MORE, OCCURS_ZERO_OR_MORE, OCCURS_ZERO_OR_ONE, SEPARATOR_CHOICE, SEPARATOR_SEQUENCE |
Constructor Summary | |
---|---|
HtmlImporter()
Constructor for the HTML importer. |
Method Summary | |
---|---|
boolean |
addTuDelimiter(java.lang.String tag)
Add an HTML tag to the set of tags that signal the start of a translation unit. |
ConversionStatus |
convert(ConversionMode mode,
java.util.Locale language,
java.lang.String phaseName,
int maxPhase,
java.nio.charset.Charset nativeEncoding,
FileType nativeFileType,
java.lang.String nativeFileName,
java.lang.String baseDir,
Notifier notifier)
Deprecated. |
ConversionStatus |
convert(ConversionMode mode,
java.util.Locale language,
java.lang.String phaseName,
int maxPhase,
java.nio.charset.Charset nativeEncoding,
FileType nativeFileType,
java.lang.String nativeFileName,
java.lang.String baseDir,
Notifier notifier,
SegmentBoundary boundary,
java.io.StringWriter generatedFileName)
Convert an HTML file to XLIFF, creating xliff, skeleton and format files as output. |
ConversionStatus |
convert(ConversionMode mode,
java.util.Locale language,
java.lang.String phaseName,
int maxPhase,
java.nio.charset.Charset nativeEncoding,
FileType nativeFileType,
java.lang.String nativeFileName,
java.lang.String baseDir,
Notifier notifier,
SegmentBoundary boundary,
java.io.StringWriter generatedFileName,
java.util.Set<f2xutils.XMLTuXPath> skipList)
Convert an HTML file to XLIFF, creating xliff, skeleton and format files as output. |
java.lang.Object |
getConversionProperty(java.lang.String property)
Return an object representing a format-specific (and converter-specific) property. |
FileType |
getFileType()
Return the file type that this converter handles. |
java.lang.String[] |
getTuDelimiterList()
Return the HTML tags in the set of tags that signal the start of a translation unit. |
static java.nio.charset.Charset |
guessEncoding(java.lang.String htmlFileName)
Passed the name of an HTML file, look for a meta tag that indicates what encoding the file uses. |
boolean |
removeTuDelimiter(java.lang.String tag)
Remove an HTML tag from the set of tags that signal the start of a translation unit. |
void |
setConversionProperty(java.lang.String property,
java.lang.Object value)
Set a format-specific property that might affect the way that the conversion occurs. |
Methods inherited from class org.apache.xerces.parsers.AbstractSAXParser |
---|
attributeDecl, characters, comment, doctypeDecl, elementDecl, endCDATA, endDocument, endDTD, endElement, endExternalSubset, endGeneralEntity, endNamespaceMapping, endParameterEntity, externalEntityDecl, getAttributePSVI, getAttributePSVIByName, getContentHandler, getDeclHandler, getDTDHandler, getElementPSVI, getEntityResolver, getErrorHandler, getFeature, getLexicalHandler, getProperty, ignorableWhitespace, internalEntityDecl, notationDecl, parse, parse, processingInstruction, reset, setContentHandler, setDeclHandler, setDocumentHandler, setDTDHandler, setEntityResolver, setErrorHandler, setFeature, setLexicalHandler, setLocale, setProperty, startCDATA, startDocument, startElement, startExternalSubset, startGeneralEntity, startNamespaceMapping, startParameterEntity, unparsedEntityDecl, xmlDecl |
Methods inherited from class org.apache.xerces.parsers.AbstractXMLDocumentParser |
---|
any, element, empty, emptyElement, endAttlist, endConditional, endContentModel, endGroup, getDocumentSource, getDTDContentModelSource, getDTDSource, ignoredCharacters, occurrence, pcdata, separator, setDocumentSource, setDTDContentModelSource, setDTDSource, startAttlist, startConditional, startContentModel, startDTD, startGroup, textDecl |
Methods inherited from class org.apache.xerces.parsers.XMLParser |
---|
parse |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public HtmlImporter()
Method Detail |
---|
public boolean addTuDelimiter(java.lang.String tag)
tag
- HTML tag to add to the set (Examples: "p", "h1", "dl", ...)
public java.lang.Object getConversionProperty(java.lang.String property)
Specified by: getConversionProperty in interface Converter
property
- The name of the property to return.
public FileType getFileType()
Specified by: getFileType in interface Converter
public java.lang.String[] getTuDelimiterList()
public ConversionStatus convert(ConversionMode mode, java.util.Locale language, java.lang.String phaseName, int maxPhase, java.nio.charset.Charset nativeEncoding, FileType nativeFileType, java.lang.String nativeFileName, java.lang.String baseDir, Notifier notifier, SegmentBoundary boundary, java.io.StringWriter generatedFileName) throws ConversionException
Specified by: convert in interface Converter
mode
- The mode of conversion (to or from XLIFF).
language
- The language of the input file.
phaseName
- The target phase-name. This value is ignored.
maxPhase
- The maximum phase number. This value is ignored.
nativeEncoding
- The encoding of the input file. This parameter tells
the converter how to interpret the bytes read from the input file, so
that it can convert them to UTF-8 for XLIFF. (Note: The value of this
parameter is only a "suggestion." The converter will check the input
file for a meta tag that indicates the encoding. If one is found, the
converter uses that value rather than the value of this parameter.)
nativeFileType
- The type of the native file. This value must be
"HTML". (Note: The value is stored in the datatype attribute of the
XLIFF's file element.)
nativeFileName
- The name of the input HTML file (without directory
prefix).
baseDir
- The directory that contains the input HTML file, from which
the input file is read. This is also the directory in which the output
xliff, skeleton and format files will be written. The output files will
be named as follows:
notifier
- Instance of a class that implements the Notifier
interface (to send notifications in case of conversion error).
boundary
- The boundary on which to segment translation units (e.g.,
on paragraph or sentence boundaries).
generatedFileName
- If non-null, the converter will write the name
of the file (without parent directories) to which the generated
XLIFF file was written.
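The nativeEncoding description above amounts to decoding the input bytes with the native charset and re-encoding them as UTF-8. The following is a minimal sketch of that step using only the JDK. The class TranscodeSketch and its method toUtf8 are hypothetical names introduced for illustration and are not part of file2xliff4j; the converter's actual implementation may differ.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration only; not part of file2xliff4j.
final class TranscodeSketch {

    /** Decodes the file's bytes with nativeEncoding and returns the text re-encoded as UTF-8. */
    static byte[] toUtf8(Path input, Charset nativeEncoding) throws IOException {
        byte[] raw = Files.readAllBytes(input);
        // The charset decides how each byte sequence maps to characters.
        String text = new String(raw, nativeEncoding);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("sample", ".html");
        try {
            // U+00E9 is one byte in ISO-8859-1 and two bytes in UTF-8.
            Files.write(tmp, "<p>caf\u00e9</p>".getBytes(StandardCharsets.ISO_8859_1));
            byte[] utf8 = toUtf8(tmp, StandardCharsets.ISO_8859_1);
            System.out.println(Files.size(tmp) + " bytes in, " + utf8.length + " bytes out");
        } finally {
            Files.delete(tmp);
        }
    }
}
```

If the wrong charset is passed, the decode step silently produces wrong characters rather than failing, which is presumably why the converter prefers an encoding declared in the file itself.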
ConversionException
- If a conversion exception is encountered.
public ConversionStatus convert(ConversionMode mode, java.util.Locale language, java.lang.String phaseName, int maxPhase, java.nio.charset.Charset nativeEncoding, FileType nativeFileType, java.lang.String nativeFileName, java.lang.String baseDir, Notifier notifier, SegmentBoundary boundary, java.io.StringWriter generatedFileName, java.util.Set<f2xutils.XMLTuXPath> skipList) throws ConversionException
Specified by: convert in interface Converter
mode
- The mode of conversion (to or from XLIFF).
language
- The language of the input file.
phaseName
- The target phase-name. This value is ignored.
maxPhase
- The maximum phase number. This value is ignored.
nativeEncoding
- The encoding of the input file. This parameter tells
the converter how to interpret the bytes read from the input file, so
that it can convert them to UTF-8 for XLIFF. (Note: The value of this
parameter is only a "suggestion." The converter will check the input
file for a meta tag that indicates the encoding. If one is found, the
converter uses that value rather than the value of this parameter.)
nativeFileType
- The type of the native file. This value must be
"HTML". (Note: The value is stored in the datatype attribute of the
XLIFF's file element.)
nativeFileName
- The name of the input HTML file (without directory
prefix).
baseDir
- The directory that contains the input HTML file, from which
the input file is read. This is also the directory in which the output
xliff, skeleton and format files will be written. The output files will
be named as follows:
notifier
- Instance of a class that implements the Notifier
interface (to send notifications in case of conversion error).
boundary
- The boundary on which to segment translation units (e.g.,
on paragraph or sentence boundaries).
generatedFileName
- If non-null, the converter will write the name
of the file (without parent directories) to which the generated
XLIFF file was written.
skipList
- (Not used by this converter.)
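The generatedFileName parameter described above works as an optional out-parameter: the caller passes a StringWriter, or null when the name is not needed, and reads the name back after the call. The sketch below shows only that pattern. The method convertStub and the ".xliff" suffix are assumptions made for illustration; the real naming rule is defined by the converter and is not reproduced here.

```java
import java.io.StringWriter;

// Hypothetical stand-in that illustrates the out-parameter pattern only.
final class GeneratedNameSketch {

    /**
     * Pretends to convert nativeFileName and reports the output file name
     * through generatedFileName when the caller supplied a writer.
     */
    static void convertStub(String nativeFileName, StringWriter generatedFileName) {
        // Assumed naming rule for this sketch; not taken from the library.
        String xliffName = nativeFileName + ".xliff";
        if (generatedFileName != null) {
            generatedFileName.write(xliffName);
        }
    }

    public static void main(String[] args) {
        StringWriter name = new StringWriter();
        convertStub("index.html", name);
        System.out.println("XLIFF written to: " + name);

        // Passing null is allowed when the caller does not need the name.
        convertStub("index.html", null);
    }
}
```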
ConversionException
- If a conversion exception is encountered.
@Deprecated public ConversionStatus convert(ConversionMode mode, java.util.Locale language, java.lang.String phaseName, int maxPhase, java.nio.charset.Charset nativeEncoding, FileType nativeFileType, java.lang.String nativeFileName, java.lang.String baseDir, Notifier notifier) throws ConversionException
Specified by: convert in interface Converter
mode
- The mode of conversion (to or from XLIFF).
language
- The language of the input file.
phaseName
- The target phase-name. This value is ignored.
maxPhase
- The maximum phase number. This value is ignored.
nativeEncoding
- The encoding of the input file. This parameter tells
the converter how to interpret the bytes read from the input file, so
that it can convert them to UTF-8 for XLIFF. (Note: The value of this
parameter is only a "suggestion." The converter will check the input
file for a meta tag that indicates the encoding. If one is found, the
converter uses that value rather than the value of this parameter.)
nativeFileType
- The type of the native file. This value must be
"HTML". (Note: The value is stored in the datatype attribute of the
XLIFF's file element.)
nativeFileName
- The name of the input HTML file (without directory
prefix).
baseDir
- The directory that contains the input HTML file, from which
the input file is read. This is also the directory in which the output
xliff, skeleton and format files will be written. The output files will
be named as follows:
notifier
- Instance of a class that implements the Notifier
interface (to send notifications in case of conversion error).
ConversionException
- If a conversion exception is encountered.
public boolean removeTuDelimiter(java.lang.String tag)
tag
- HTML tag to remove from the set
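The three delimiter methods (addTuDelimiter, removeTuDelimiter, getTuDelimiterList) read like operations on a set of tag names. The sketch below models them with a plain java.util.Set. It is an illustration under assumptions, not the library's code: the class TuDelimiterSketch is hypothetical, and the meaning given to the boolean results (true when the set changed) is assumed, since this page does not document the return values.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical model of a translation-unit delimiter set; not part of file2xliff4j.
final class TuDelimiterSketch {

    // LinkedHashSet keeps insertion order, so list() is predictable.
    private final Set<String> delimiters = new LinkedHashSet<>();

    /** Returns true if the tag was not already in the set (assumed semantics). */
    boolean add(String tag) {
        return delimiters.add(tag);
    }

    /** Returns true if the tag was in the set and has been removed (assumed semantics). */
    boolean remove(String tag) {
        return delimiters.remove(tag);
    }

    /** Returns a snapshot of the current delimiter tags. */
    String[] list() {
        return delimiters.toArray(new String[0]);
    }

    public static void main(String[] args) {
        TuDelimiterSketch set = new TuDelimiterSketch();
        set.add("p");
        set.add("h1");
        set.add("dl");
        set.remove("dl");
        System.out.println(String.join(", ", set.list()));
    }
}
```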
public static java.nio.charset.Charset guessEncoding(java.lang.String htmlFileName) throws ConversionException
htmlFileName
- The name of an HTML file
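As a rough illustration of the kind of check guessEncoding describes, the sketch below scans the start of an HTML file for a meta tag that carries a charset value. It is not the library's implementation. The class MetaCharsetSketch, the method guess, the fallback argument, and the 4096-byte scan window are all assumptions made for this example.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of file2xliff4j.
final class MetaCharsetSketch {

    // Matches charset=NAME inside a <meta ...> tag, covering both
    // <meta charset="utf-8"> and content="text/html; charset=utf-8".
    private static final Pattern META_CHARSET = Pattern.compile(
            "<meta\\b[^>]*?charset\\s*=\\s*[\"']?([A-Za-z0-9][A-Za-z0-9._:+-]*)",
            Pattern.CASE_INSENSITIVE);

    /** Returns the charset named in a meta tag, or fallback if none is found or supported. */
    static Charset guess(Path htmlFile, Charset fallback) throws IOException {
        byte[] raw = Files.readAllBytes(htmlFile);
        // ISO-8859-1 maps every byte to one char, so ASCII markup stays
        // readable for any ASCII-compatible encoding.
        int window = Math.min(raw.length, 4096); // assumed scan window
        String head = new String(raw, 0, window, StandardCharsets.ISO_8859_1);
        Matcher m = META_CHARSET.matcher(head);
        if (m.find() && Charset.isSupported(m.group(1))) {
            return Charset.forName(m.group(1));
        }
        return fallback;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("sample", ".html");
        try {
            String html = "<html><head><meta charset=\"utf-8\"></head><body></body></html>";
            Files.write(tmp, html.getBytes(StandardCharsets.US_ASCII));
            System.out.println(guess(tmp, StandardCharsets.ISO_8859_1));
        } finally {
            Files.delete(tmp);
        }
    }
}
```

A scan like this cannot detect encodings that are not ASCII-compatible, such as UTF-16, so a caller-supplied fallback remains useful.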
ConversionException
- If an error is encountered.
public void setConversionProperty(java.lang.String property, java.lang.Object value) throws ConversionException
Note: This converter needs no format-specific properties. If any are passed, they will be silently ignored.
Specified by: setConversionProperty in interface Converter
property
- The name of the property.
value
- The value of the property.
ConversionException
- If the property isn't recognized (and if it matters).