Package org.w3c.tidy
Class Tidy
java.lang.Object
org.w3c.tidy.Tidy
- All Implemented Interfaces:
Serializable
HTML parser and pretty printer.
- Version:
- $Revision: 1033 $ ($Author: aditsu $)
- Author:
- Dave Raggett dsr@w3.org, Andy Quick ac.quick@sympatico.ca (translation to Java), Fabrizio Giustina
-
Constructor Summary
Constructors -
Method Summary
Modifier and TypeMethodDescriptionstatic Document
Creates an empty DOM Document.alt-text
- default text for alt attribute.boolean
ascii-chars
- convert quotes and dashes to nearest ASCII char.boolean
break-before-br - output newline before <br>.boolean
split
- create slides on each h2 element.Returns the actual configurationdoctype
- user specified doctype.boolean
drop-empty-paras
- discard empty p elements.boolean
drop-font-tags
- discard presentation tags.boolean
drop-proprietary-attributes
- discard proprietary attributes.boolean
getEmacs()
gnu-emacs
- if true format error output for GNU Emacs.boolean
enclose-block-text
- if true text in blocks is wrapped in <p>'s.boolean
enclose-text
- if true text at body is wrapped in <p>'s.Errfile - file name to write errors to.Errout - the error output stream.boolean
escape-cdata
- replace CDATA sections with escaped text.boolean
fix-backslash
- fix URLs by replacing \ with /.boolean
fix-bad-comments
- fix comments with adjacent hyphens.boolean
fix-uri
- fix uri references applying URI encoding if necessary.boolean
force-output
- output document even if errors were found.boolean
hide-comments
- hides all (real) comments in output.boolean
hide-endtags - suppress optional end tags.boolean
indent-attributes
- newline+indent before each attribute.boolean
indent-cdata
- indent CDATA sections.boolean
indent - indent content of appropriate tags.input-encoding
the character encoding used for input.boolean
join-classes
- join multiple class attributes.boolean
join-styles
- join multiple style attributes.boolean
keep-time
- if true last modified time is preserved.boolean
literal-attributes
- if true attributes may use newlines.boolean
logical-emphasis
- replace i by em and b by strong.boolean
lower-literals
- folds known attribute values to lower case.boolean
make-bare - remove Microsoft cruft.boolean
make-clean - remove presentational clutter.boolean
numeric-entities
- output entities other than the built-in HTML entities in the numeric rather than the named entity form.boolean
only-errors - if true normal output is suppressed.output-encoding
the character encoding used for output.int
ParseErrors - the number of errors that occurred in the most recent parse operation.int
ParseWarnings - the number of warnings that occurred in the most recent parse operation.boolean
print-body-only
- output BODY content only.boolean
getQuiet()
quiet - no 'Parsing X', guessed DTD or summary.boolean
quote-ampersand
- output naked ampersand as &amp;.boolean
quote-marks
- output " marks as &quot;.boolean
quote-nbsp
- output non-breaking space as entity.boolean
output-raw
- avoid mapping values > 127 to entities.int
repeated-attributes
- keep first or last duplicate attribute.boolean
replace-color
- replace hex color attribute values with names.int
show-errors
- number of errors to put out.boolean
show-warnings - show warnings? (errors are always shown).boolean
SmartIndent - does text/block level content affect indentation.int
indent-spaces
- default indentation.int
tab-size
- tab size in chars.boolean
tidy-mark
- add meta element indicating tidied doc.boolean
trim-empty-elements
- trim empty elements.boolean
uppercase-attributes - output attributes in upper case.boolean
uppercase-tags - output tags in upper case.boolean
word-2000
- draconian cleaning for Word2000.boolean
wrap-asp
- wrap within ASP pseudo elements.boolean
wrap-attributes
- wrap within attribute values.boolean
wrap-jste
- wrap within JSTE pseudo elements.int
wrap
- default wrap margin.boolean
wrap-php
- wrap within PHP pseudo elements.boolean
wrap-script-literals
- wrap within JavaScript string literals.boolean
wrap-sections
- wrap within <![ ... ]> section tags.boolean
writeback - if true then output tidied markup.boolean
getXHTML()
output-xhtml - output extensible HTML.boolean
output-xml - create output as XML.boolean
getXmlPi()
add-xml-pi
- add <?xml?> for XML docs.boolean
assume-xml-procins
This option specifies if Tidy should change the parsing of processing instructions to require ?> as the terminator rather than >.boolean
add-xml-space
- if set to yes adds xml:space attr as needed.boolean
input-xml - treat input as XML.static void
Command line interface to parser and pretty printer.protected int
Main method, but returns the return code as an int instead of calling System.exit(code).parse
(InputStream in, OutputStream out) Reads from the given input and returns the root Node.parse
(InputStream in, Writer out) Reads from the given input and returns the root Node.parse
(Reader in, OutputStream out) Reads from the given input and returns the root Node.Reads from the given input and returns the root Node.parseDOM
(InputStream in, OutputStream out) Parses InputStream in and returns a DOM Document node.void
pprint
(Document doc, OutputStream out) Pretty-prints a DOM Document.void
pprint
(Node node, OutputStream out) Pretty-prints a DOM Node.void
setAltText
(String altText) alt-text
- default text for alt attribute.void
setAsciiChars
(boolean asciiChars) ascii-chars
- convert quotes and dashes to nearest ASCII char.void
setBreakBeforeBR
(boolean breakBeforeBR) break-before-br - output newline before <br>.void
setBurstSlides
(boolean burstSlides) split
- create slides on each h2 element.void
setConfigurationFromFile
(String filename) Sets the configuration from a configuration file.void
Sets the configuration from a properties object.void
setDocType
(String doctype) doctype
- user specified doctype.void
setDropEmptyParas
(boolean dropEmptyParas) drop-empty-paras
- discard empty p elements.void
setDropFontTags
(boolean dropFontTags) drop-font-tags
- discard presentation tags.void
setDropProprietaryAttributes
(boolean dropProprietaryAttributes) drop-proprietary-attributes
- discard proprietary attributes.void
setEmacs
(boolean emacs) gnu-emacs
- if true format error output for GNU Emacs.void
setEncloseBlockText
(boolean encloseBlockText) enclose-block-text
- if true text in blocks is wrapped in <p>'s.void
setEncloseText
(boolean encloseText) enclose-text
- if true text at body is wrapped in <p>'s.void
setErrfile
(String errfile) Errfile - file name to write errors to.void
setErrout
(PrintWriter out) void
setEscapeCdata
(boolean escapeCdata) escape-cdata
- replace CDATA sections with escaped text.void
setFixBackslash
(boolean fixBackslash) fix-backslash
- fix URLs by replacing \ with /.void
setFixComments
(boolean fixComments) fix-bad-comments
- fix comments with adjacent hyphens.void
setFixUri
(boolean fixUri) fix-uri
- fix uri references applying URI encoding if necessary.void
setForceOutput
(boolean forceOutput) force-output
- output document even if errors were found.void
setHideComments
(boolean hideComments) hide-comments
- hides all (real) comments in output.void
setHideEndTags
(boolean hideEndTags) hide-endtags - suppress optional end tags.void
setIndentAttributes
(boolean indentAttributes) indent-attributes
- newline+indent before each attribute.void
setIndentCdata
(boolean indentCdata) indent-cdata
- indent CDATA sections.void
setIndentContent
(boolean indentContent) indent - indent content of appropriate tags.void
setInputEncoding
(String encoding) input-encoding
the character encoding used for input.void
setInputStreamName
(String name) InputStreamName - the name of the input stream (printed in the header information).void
setJoinClasses
(boolean joinClasses) join-classes
- join multiple class attributes.void
setJoinStyles
(boolean joinStyles) join-styles
- join multiple style attributes.void
setKeepFileTimes
(boolean keepFileTimes) keep-time
- if true last modified time is preserved.void
setLiteralAttribs
(boolean literalAttribs) literal-attributes
- if true attributes may use newlines.void
setLogicalEmphasis
(boolean logicalEmphasis) logical-emphasis
- replace i by em and b by strong.void
setLowerLiterals
(boolean lowerLiterals) lower-literals
- folds known attribute values to lower case.void
setMakeBare
(boolean makeBare) make-bare - remove Microsoft cruft.void
setMakeClean
(boolean makeClean) make-clean - remove presentational clutter.void
setMessageListener
(TidyMessageListener listener) Attach a TidyMessageListener which will be notified for messages and errors.void
setNumEntities
(boolean numEntities) numeric-entities
- output entities other than the built-in HTML entities in the numeric rather than the named entity form.void
setOnlyErrors
(boolean onlyErrors) only-errors - if true normal output is suppressed.void
setOutputEncoding
(String encoding) output-encoding
the character encoding used for output.void
setPrintBodyOnly
(boolean bodyOnly) print-body-only
- output BODY content only.void
setQuiet
(boolean quiet) quiet - no 'Parsing X', guessed DTD or summary.void
setQuoteAmpersand
(boolean quoteAmpersand) quote-ampersand
- output naked ampersand as &amp;.void
setQuoteMarks
(boolean quoteMarks) quote-marks
- output " marks as &quot;.void
setQuoteNbsp
(boolean quoteNbsp) quote-nbsp
- output non-breaking space as entity.void
setRawOut
(boolean rawOut) output-raw
- avoid mapping values > 127 to entities.void
setRepeatedAttributes
(int repeatedAttributes) repeated-attributes
- keep first or last duplicate attribute.void
setReplaceColor
(boolean replaceColor) replace-color
- replace hex color attribute values with names.void
setShowErrors
(int showErrors) show-errors
- set the number of errors to put out.void
setShowWarnings
(boolean showWarnings) show-warnings - show warnings? (errors are always shown).void
setSmartIndent
(boolean smartIndent) SmartIndent - does text/block level content affect indentation.void
setSpaces
(int spaces) indent-spaces
- default indentation.void
setTabsize
(int tabsize) tab-size
- tab size in chars.void
setTidyMark
(boolean tidyMark) tidy-mark
- add meta element indicating tidied doc.void
setTrimEmptyElements
(boolean trimEmpty) trim-empty-elements
- trim empty elements.void
setUpperCaseAttrs
(boolean upperCaseAttrs) uppercase-attributes - output attributes in upper case.void
setUpperCaseTags
(boolean upperCaseTags) uppercase-tags - output tags in upper case.void
setWord2000
(boolean word2000) word-2000
- draconian cleaning for Word2000.void
setWrapAsp
(boolean wrapAsp) wrap-asp
- wrap within ASP pseudo elements.void
setWrapAttVals
(boolean wrapAttVals) wrap-attributes
- wrap within attribute values.void
setWrapJste
(boolean wrapJste) wrap-jste
- wrap within JSTE pseudo elements.void
setWraplen
(int wraplen) wrap
- default wrap margin.void
setWrapPhp
(boolean wrapPhp) wrap-php
- wrap within PHP pseudo elements.void
setWrapScriptlets
(boolean wrapScriptlets) wrap-script-literals
- wrap within JavaScript string literals.void
setWrapSection
(boolean wrapSection) wrap-sections
- wrap within <![ ... ]> section tags.void
setWriteback
(boolean writeback) writeback - if true then output tidied markup.void
setXHTML
(boolean xhtml) output-xhtml - output extensible HTML.void
setXmlOut
(boolean xmlOut) output-xml - create output as XML.void
setXmlPi
(boolean xmlPi) add-xml-pi
- add <?xml?> for XML docs.void
setXmlPIs
(boolean xmlPIs) assume-xml-procins
This option specifies if Tidy should change the parsing of processing instructions to require ?> as the terminator rather than >.void
setXmlSpace
(boolean xmlSpace) add-xml-space
- if set to yes adds xml:space attr as needed.void
setXmlTags
(boolean xmlTags) input-xml - treat input as XML.
-
Constructor Details
-
Tidy
public Tidy()
Instantiates a new Tidy instance. It is recommended that a new instance be used for each parse.
-
-
Method Details
-
getConfiguration
Returns the actual configuration.
- Returns:
- tidy configuration
-
getStderr
-
getParseErrors
public int getParseErrors()
ParseErrors - the number of errors that occurred in the most recent parse operation.
- Returns:
- number of errors that occurred in the most recent parse operation.
-
getParseWarnings
public int getParseWarnings()
ParseWarnings - the number of warnings that occurred in the most recent parse operation.
- Returns:
- number of warnings that occurred in the most recent parse operation.
-
setInputStreamName
InputStreamName - the name of the input stream (printed in the header information).- Parameters:
name
- input stream name
-
getInputStreamName
-
getErrout
Errout - the error output stream.- Returns:
- error output stream.
-
setErrout
-
setConfigurationFromFile
Sets the configuration from a configuration file.- Parameters:
filename
- configuration file name/path.
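As an illustration, a configuration file can be generated from Java and its path then handed to setConfigurationFromFile. The sketch below uses only the JDK. It assumes the file holds one `option: value` pair per line, that option names are the hyphenated ones listed on this page, and that boolean options take yes/no (as the add-xml-space description suggests). The class name, helper names, and chosen values are hypothetical. The call into Tidy appears only as a comment, because it needs JTidy on the classpath.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

class TidyConfigFileSketch {

    /** Option lines for the file; names come from this page, values are examples. */
    static List<String> configLines() {
        return Arrays.asList(
            "indent: yes",
            "indent-spaces: 2",
            "wrap: 100",
            "output-xhtml: yes",
            "quiet: yes");
    }

    /** Writes the option lines to a temporary file and returns its path. */
    static Path writeConfig() {
        try {
            Path file = Files.createTempFile("jtidy", ".cfg");
            Files.write(file, configLines(), StandardCharsets.UTF_8);
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path file = writeConfig();
        // With JTidy on the classpath (not exercised here):
        //   Tidy tidy = new Tidy();
        //   tidy.setConfigurationFromFile(file.toString());
        System.out.println("Wrote " + configLines().size() + " options to " + file);
    }
}
```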
-
setConfigurationFromProps
Sets the configuration from a properties object.- Parameters:
props
- Properties object
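A minimal sketch of preparing such a Properties object, assuming the keys are the hyphenated option names listed on this page and that boolean options accept yes/no. The class name, helper name, and values are hypothetical, and the call into Tidy is left as a comment because it needs JTidy on the classpath.

```java
import java.util.Properties;

class TidyPropsSketch {

    /** Builds a Properties object keyed by option names taken from this page. */
    static Properties tidyOptions() {
        Properties props = new Properties();
        props.setProperty("indent", "yes");
        props.setProperty("tab-size", "4");
        props.setProperty("show-warnings", "no");
        props.setProperty("numeric-entities", "yes");
        return props;
    }

    public static void main(String[] args) {
        Properties props = tidyOptions();
        // With JTidy on the classpath (not exercised here):
        //   Tidy tidy = new Tidy();
        //   tidy.setConfigurationFromProps(props);
        props.stringPropertyNames().stream().sorted()
             .forEach(k -> System.out.println(k + " = " + props.getProperty(k)));
    }
}
```

Building the object in code rather than loading a file keeps the configuration next to the call site, which can be convenient in tests.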
-
createEmptyDocument
Creates an empty DOM Document.- Returns:
- a new org.w3c.dom.Document
-
parse
Reads from the given input and returns the root Node. If out is non-null, pretty prints to out. Warning: the caller is responsible for calling close() on input and output after calling this method.
- Parameters:
in
- input
out
- optional destination for pretty-printed document
- Returns:
- parsed org.w3c.tidy.Node
-
parse
Reads from the given input and returns the root Node. If out is non-null, pretty prints to out. Warning: the caller is responsible for calling close() on input and output after calling this method.
- Parameters:
in
- input
out
- optional destination for pretty-printed document
- Returns:
- parsed org.w3c.tidy.Node
-
parse
Reads from the given input and returns the root Node. If out is non-null, pretty prints to out. Warning: the caller is responsible for calling close() on input and output after calling this method.
- Parameters:
in
- input
out
- optional destination for pretty-printed document
- Returns:
- parsed org.w3c.tidy.Node
-
parse
Reads from the given input and returns the root Node. If out is non-null, pretty prints to out. Warning: the caller is responsible for calling close() on input and output after calling this method.
- Parameters:
in
- input
out
- optional destination for pretty-printed document
- Returns:
- parsed org.w3c.tidy.Node
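Because parse leaves both streams open, one way to honour the warning is to wrap the call in try-with-resources. The sketch below is JDK-only and makes no claim about JTidy internals: the parser is passed in as a callback, and the hypothetical copyThrough helper stands in for tidy.parse(in, out) so the example runs without JTidy. All class and method names here are illustrative.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.function.BiConsumer;

class ParseStreamsSketch {

    /**
     * Opens both streams, hands them to the parser callback, and closes them
     * afterwards via try-with-resources, even if the callback throws.
     */
    static String run(String html, BiConsumer<InputStream, OutputStream> parser) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (InputStream in = new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8));
             OutputStream out = sink) {
            parser.accept(in, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(sink.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Stand-in for tidy.parse(in, out): copies the input through unchanged. */
    static void copyThrough(InputStream in, OutputStream out) {
        try {
            byte[] buf = new byte[4096];
            for (int n = in.read(buf); n != -1; n = in.read(buf)) {
                out.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // With JTidy on the classpath the callback would instead be
        //   (in, out) -> tidy.parse(in, out)
        System.out.println(run("<p>hello", ParseStreamsSketch::copyThrough));
    }
}
```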
-
parseDOM
Parses InputStream in and returns a DOM Document node. If out is non-null, pretty prints to OutputStream out.
- Parameters:
in
- input stream
out
- optional output stream
- Returns:
- parsed org.w3c.dom.Document
-
parseDOM
-
pprint
Pretty-prints a DOM Document. The document must be an instance of org.w3c.tidy.DOMDocumentImpl. The caller is responsible for closing the output stream after calling this method.
- Parameters:
doc
- org.w3c.dom.Document
out
- output stream
-
pprint
Pretty-prints a DOM Node. The caller is responsible for closing the output stream after calling this method.
- Parameters:
node
- org.w3c.dom.Node. Must be an instance of org.w3c.tidy.DOMNodeImpl.
out
- output stream
-
main
Command line interface to parser and pretty printer.- Parameters:
argv
- command line parameters
-
mainExec
Main method, but returns the return code as an int instead of calling System.exit(code). Needed for testing the main method without shutting down tests.
- Parameters:
argv
- command line parameters
- Returns:
- return code
-
setMessageListener
Attach a TidyMessageListener which will be notified for messages and errors.- Parameters:
listener
- TidyMessageListener implementation
-
setSpaces
public void setSpaces(int spaces) indent-spaces
- default indentation.- Parameters:
spaces
- number of spaces used for indentation- See Also:
-
getSpaces
public int getSpaces()indent-spaces
- default indentation.- Returns:
- number of spaces used for indentation
-
setWraplen
public void setWraplen(int wraplen) wrap
- default wrap margin.- Parameters:
wraplen
- default wrap margin- See Also:
-
getWraplen
public int getWraplen()wrap
- default wrap margin.- Returns:
- default wrap margin
-
setTabsize
public void setTabsize(int tabsize) tab-size
- tab size in chars.- Parameters:
tabsize
- tab size in chars- See Also:
-
getTabsize
public int getTabsize()tab-size
- tab size in chars.- Returns:
- tab size in chars
-
setErrfile
Errfile - file name to write errors to.- Parameters:
errfile
- file name to write errors to- See Also:
-
getErrfile
Errfile - file name to write errors to.- Returns:
- error file name
-
setWriteback
public void setWriteback(boolean writeback) writeback - if true then output tidied markup. NOTE: this property is ignored when parsing from an InputStream.- Parameters:
writeback
-true
= output tidied markup- See Also:
-
getWriteback
public boolean getWriteback()writeback - if true then output tidied markup. NOTE: this property is ignored when parsing from an InputStream.- Returns:
true
if tidy will output tidied markup in input file- See Also:
-
setOnlyErrors
public void setOnlyErrors(boolean onlyErrors) only-errors - if true normal output is suppressed.- Parameters:
onlyErrors
- iftrue
normal output is suppressed.- See Also:
-
getOnlyErrors
public boolean getOnlyErrors()only-errors - if true normal output is suppressed.- Returns:
true
if normal output is suppressed.- See Also:
-
setShowWarnings
public void setShowWarnings(boolean showWarnings) show-warnings - show warnings? (errors are always shown).- Parameters:
showWarnings
- iffalse
warnings are not shown- See Also:
-
getShowWarnings
public boolean getShowWarnings()show-warnings - show warnings? (errors are always shown).- Returns:
false
if warnings are not shown- See Also:
-
setQuiet
public void setQuiet(boolean quiet) quiet - no 'Parsing X', guessed DTD or summary.- Parameters:
quiet
-true
= don't output summary, warnings or errors- See Also:
-
getQuiet
public boolean getQuiet()quiet - no 'Parsing X', guessed DTD or summary.- Returns:
true
if tidy will not output summary, warnings or errors- See Also:
-
setIndentContent
public void setIndentContent(boolean indentContent) indent - indent content of appropriate tags.- Parameters:
indentContent
- indent content of appropriate tags- See Also:
-
getIndentContent
public boolean getIndentContent()indent - indent content of appropriate tags.- Returns:
true
if tidy will indent content of appropriate tags- See Also:
-
setSmartIndent
public void setSmartIndent(boolean smartIndent)
SmartIndent - does text/block level content affect indentation.
- Parameters:
smartIndent
- true if text/block level content should affect indentation
-
getSmartIndent
public boolean getSmartIndent()
SmartIndent - does text/block level content affect indentation.
- Returns:
- true if text/block level content should affect indentation
-
setHideEndTags
public void setHideEndTags(boolean hideEndTags) hide-endtags - suppress optional end tags.- Parameters:
hideEndTags
-true
= suppress optional end tags- See Also:
-
getHideEndTags
public boolean getHideEndTags()hide-endtags - suppress optional end tags.- Returns:
true
if tidy will suppress optional end tags- See Also:
-
setXmlTags
public void setXmlTags(boolean xmlTags) input-xml - treat input as XML.- Parameters:
xmlTags
-true
if tidy should treat input as XML- See Also:
-
getXmlTags
public boolean getXmlTags()input-xml - treat input as XML.- Returns:
true
if tidy will treat input as XML- See Also:
-
setXmlOut
public void setXmlOut(boolean xmlOut) output-xml - create output as XML.- Parameters:
xmlOut
-true
if tidy should create output as xml- See Also:
-
getXmlOut
public boolean getXmlOut()output-xml - create output as XML.- Returns:
true
if tidy will create output as xml- See Also:
-
setXHTML
public void setXHTML(boolean xhtml) output-xhtml - output extensible HTML.- Parameters:
xhtml
-true
if tidy should output XHTML- See Also:
-
getXHTML
public boolean getXHTML()output-xhtml - output extensible HTML.- Returns:
true
if tidy will output XHTML- See Also:
-
setUpperCaseTags
public void setUpperCaseTags(boolean upperCaseTags) uppercase-tags - output tags in upper case.- Parameters:
upperCaseTags
-true
if tidy should output tags in upper case (default is lowercase)- See Also:
-
getUpperCaseTags
public boolean getUpperCaseTags()
uppercase-tags - output tags in upper case.
- Returns:
- true if tidy will output tags in upper case
-
setUpperCaseAttrs
public void setUpperCaseAttrs(boolean upperCaseAttrs) uppercase-attributes - output attributes in upper case.- Parameters:
upperCaseAttrs
-true
if tidy should output attributes in upper case (default is lowercase)- See Also:
-
getUpperCaseAttrs
public boolean getUpperCaseAttrs()
uppercase-attributes - output attributes in upper case.
- Returns:
- true if tidy will output attributes in upper case
-
setMakeClean
public void setMakeClean(boolean makeClean) make-clean - remove presentational clutter.- Parameters:
makeClean
- true to remove presentational clutter- See Also:
-
getMakeClean
public boolean getMakeClean()make-clean - remove presentational clutter.- Returns:
- true if tidy will remove presentational clutter
-
setMakeBare
public void setMakeBare(boolean makeBare) make-bare - remove Microsoft cruft.- Parameters:
makeBare
- true to remove Microsoft cruft- See Also:
-
getMakeBare
public boolean getMakeBare()
make-bare - remove Microsoft cruft.
- Returns:
- true if tidy will remove Microsoft cruft
-
setBreakBeforeBR
public void setBreakBeforeBR(boolean breakBeforeBR) break-before-br - output newline before <br>.- Parameters:
breakBeforeBR
-true
if tidy should output a newline before <br>- See Also:
-
getBreakBeforeBR
public boolean getBreakBeforeBR()break-before-br - output newline before <br>.- Returns:
true
if tidy will output a newline before <br>- See Also:
-
setBurstSlides
public void setBurstSlides(boolean burstSlides) split
- create slides on each h2 element.- Parameters:
burstSlides
-true
if tidy should create slides on each h2 element- See Also:
-
getBurstSlides
public boolean getBurstSlides()split
- create slides on each h2 element.- Returns:
true
if tidy will create slides on each h2 element- See Also:
-
setNumEntities
public void setNumEntities(boolean numEntities) numeric-entities
- output entities other than the built-in HTML entities in the numeric rather than the named entity form.- Parameters:
numEntities
-true
if tidy should output entities in the numeric form.- See Also:
-
getNumEntities
public boolean getNumEntities()numeric-entities
- output entities other than the built-in HTML entities in the numeric rather than the named entity form.- Returns:
true
if tidy will output entities in the numeric form.- See Also:
-
setQuoteMarks
public void setQuoteMarks(boolean quoteMarks) quote-marks
- output " marks as &quot;.
- Parameters:
quoteMarks
- true if tidy should output " marks as &quot;
-
getQuoteMarks
public boolean getQuoteMarks()quote-marks
- output " marks as &quot;.
- Returns:
- true if tidy will output " marks as &quot;
-
setQuoteNbsp
public void setQuoteNbsp(boolean quoteNbsp) quote-nbsp
- output non-breaking space as entity.- Parameters:
quoteNbsp
-true
if tidy should output non-breaking space as entity- See Also:
-
getQuoteNbsp
public boolean getQuoteNbsp()quote-nbsp
- output non-breaking space as entity.- Returns:
true
if tidy will output non-breaking space as entity- See Also:
-
setQuoteAmpersand
public void setQuoteAmpersand(boolean quoteAmpersand) quote-ampersand
- output naked ampersand as &amp;.
- Parameters:
quoteAmpersand
- true if tidy should output naked ampersand as &amp;
-
getQuoteAmpersand
public boolean getQuoteAmpersand()quote-ampersand
- output naked ampersand as &amp;.
- Returns:
- true if tidy will output naked ampersand as &amp;
-
setWrapAttVals
public void setWrapAttVals(boolean wrapAttVals) wrap-attributes
- wrap within attribute values.- Parameters:
wrapAttVals
-true
if tidy should wrap within attribute values- See Also:
-
getWrapAttVals
public boolean getWrapAttVals()wrap-attributes
- wrap within attribute values.- Returns:
true
if tidy will wrap within attribute values- See Also:
-
setWrapScriptlets
public void setWrapScriptlets(boolean wrapScriptlets) wrap-script-literals
- wrap within JavaScript string literals.- Parameters:
wrapScriptlets
-true
if tidy should wrap within JavaScript string literals- See Also:
-
getWrapScriptlets
public boolean getWrapScriptlets()wrap-script-literals
- wrap within JavaScript string literals.- Returns:
true
if tidy will wrap within JavaScript string literals- See Also:
-
setWrapSection
public void setWrapSection(boolean wrapSection) wrap-sections
- wrap within <![ ... ]> section tags- Parameters:
wrapSection
-true
if tidy should wrap within <![ ... ]> section tags- See Also:
-
getWrapSection
public boolean getWrapSection()wrap-sections
- wrap within <![ ... ]> section tags- Returns:
true
if tidy will wrap within <![ ... ]> section tags- See Also:
-
setAltText
alt-text
- default text for alt attribute.- Parameters:
altText
- default text for alt attribute- See Also:
-
getAltText
alt-text
- default text for alt attribute.- Returns:
- default text for alt attribute
-
setXmlPi
public void setXmlPi(boolean xmlPi) add-xml-pi
- add <?xml?> for XML docs.- Parameters:
xmlPi
-true
if tidy should add <?xml?> for XML docs- See Also:
-
getXmlPi
public boolean getXmlPi()add-xml-pi
- add <?xml?> for XML docs.- Returns:
true
if tidy will add <?xml?> for XML docs- See Also:
-
setDropFontTags
public void setDropFontTags(boolean dropFontTags) drop-font-tags
- discard presentation tags.- Parameters:
dropFontTags
-true
if tidy should discard presentation tags- See Also:
-
getDropFontTags
public boolean getDropFontTags()drop-font-tags
- discard presentation tags.- Returns:
true
if tidy will discard presentation tags- See Also:
-
setDropProprietaryAttributes
public void setDropProprietaryAttributes(boolean dropProprietaryAttributes) drop-proprietary-attributes
- discard proprietary attributes.- Parameters:
dropProprietaryAttributes
-true
if tidy should discard proprietary attributes- See Also:
-
getDropProprietaryAttributes
public boolean getDropProprietaryAttributes()drop-proprietary-attributes
- discard proprietary attributes.- Returns:
true
if tidy will discard proprietary attributes- See Also:
-
setDropEmptyParas
public void setDropEmptyParas(boolean dropEmptyParas) drop-empty-paras
- discard empty p elements.- Parameters:
dropEmptyParas
-true
if tidy should discard empty p elements- See Also:
-
getDropEmptyParas
public boolean getDropEmptyParas()drop-empty-paras
- discard empty p elements.- Returns:
true
if tidy will discard empty p elements- See Also:
-
setFixComments
public void setFixComments(boolean fixComments) fix-bad-comments
- fix comments with adjacent hyphens.- Parameters:
fixComments
-true
if tidy should fix comments with adjacent hyphens- See Also:
-
getFixComments
public boolean getFixComments()fix-bad-comments
- fix comments with adjacent hyphens.- Returns:
true
if tidy will fix comments with adjacent hyphens- See Also:
-
setWrapAsp
public void setWrapAsp(boolean wrapAsp) wrap-asp
- wrap within ASP pseudo elements.- Parameters:
wrapAsp
-true
if tidy should wrap within ASP pseudo elements- See Also:
-
getWrapAsp
public boolean getWrapAsp()wrap-asp
- wrap within ASP pseudo elements.- Returns:
true
if tidy will wrap within ASP pseudo elements- See Also:
-
setWrapJste
public void setWrapJste(boolean wrapJste) wrap-jste
- wrap within JSTE pseudo elements.- Parameters:
wrapJste
-true
if tidy should wrap within JSTE pseudo elements- See Also:
-
getWrapJste
public boolean getWrapJste()wrap-jste
- wrap within JSTE pseudo elements.- Returns:
true
if tidy will wrap within JSTE pseudo elements- See Also:
-
setWrapPhp
public void setWrapPhp(boolean wrapPhp) wrap-php
- wrap within PHP pseudo elements.- Parameters:
wrapPhp
-true
if tidy should wrap within PHP pseudo elements- See Also:
-
getWrapPhp
public boolean getWrapPhp()wrap-php
- wrap within PHP pseudo elements.- Returns:
true
if tidy will wrap within PHP pseudo elements- See Also:
-
setFixBackslash
public void setFixBackslash(boolean fixBackslash) fix-backslash
- fix URLs by replacing \ with /.- Parameters:
fixBackslash
-true
if tidy should fix URLs by replacing \ with /- See Also:
-
getFixBackslash
public boolean getFixBackslash()fix-backslash
- fix URLs by replacing \ with /.- Returns:
true
if tidy will fix URLs by replacing \ with /- See Also:
-
setIndentAttributes
public void setIndentAttributes(boolean indentAttributes) indent-attributes
- newline+indent before each attribute.- Parameters:
indentAttributes
-true
if tidy should output a newline+indent before each attribute- See Also:
-
getIndentAttributes
public boolean getIndentAttributes()indent-attributes
- newline+indent before each attribute.- Returns:
true
if tidy will output a newline+indent before each attribute- See Also:
-
setDocType
doctype
- user specified doctype.
- Parameters:
doctype
- omit | auto | strict | loose | fpi, where the fpi is a string similar to "-//ACME//DTD HTML 3.14159//EN". Note: for fpi, include the double quotes in the string.
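The note about double quotes is easy to miss in Java, where the quotes have to be escaped inside the string literal. The sketch below is a JDK-only illustration. The quoteFpi helper and class name are hypothetical and not part of this API, and the setDocType call is shown only as a comment because it needs JTidy on the classpath.

```java
class DoctypeValueSketch {

    /** Wraps a formal public identifier in the double quotes the note asks for. */
    static String quoteFpi(String fpi) {
        if (fpi.length() >= 2 && fpi.startsWith("\"") && fpi.endsWith("\"")) {
            return fpi; // already quoted, leave as-is
        }
        return "\"" + fpi + "\"";
    }

    public static void main(String[] args) {
        String value = quoteFpi("-//ACME//DTD HTML 3.14159//EN");
        // With JTidy on the classpath (not exercised here):
        //   tidy.setDocType(value);
        System.out.println(value);
    }
}
```

The keyword values (omit, auto, strict, loose) are passed without quotes, so a helper like this would apply only to the fpi form.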
-
getDocType
doctype
- user specified doctype.
- Returns:
- omit | auto | strict | loose | fpi, where the fpi is a string similar to "-//ACME//DTD HTML 3.14159//EN". Note: for fpi, include the double quotes in the string.
-
setLogicalEmphasis
public void setLogicalEmphasis(boolean logicalEmphasis) logical-emphasis
- replace i by em and b by strong.- Parameters:
logicalEmphasis
-true
if tidy should replace i by em and b by strong- See Also:
-
getLogicalEmphasis
public boolean getLogicalEmphasis()logical-emphasis
- replace i by em and b by strong.- Returns:
true
if tidy will replace i by em and b by strong- See Also:
-
setXmlPIs
public void setXmlPIs(boolean xmlPIs) assume-xml-procins
This option specifies if Tidy should change the parsing of processing instructions to require ?> as the terminator rather than >. This option is automatically set if the input is in XML.- Parameters:
xmlPIs
-true
if tidy should expect a ?> at the end of processing instructions- See Also:
-
getXmlPIs
public boolean getXmlPIs()assume-xml-procins
This option specifies if Tidy should change the parsing of processing instructions to require ?> as the terminator rather than >. This option is automatically set if the input is in XML.- Returns:
true
if tidy will expect a ?> at the end of processing instructions- See Also:
-
setEncloseText
public void setEncloseText(boolean encloseText) enclose-text
- if true text at body is wrapped in <p>'s.- Parameters:
encloseText
-true
if tidy should wrap text at body in <p>'s.- See Also:
-
getEncloseText
public boolean getEncloseText()enclose-text
- if true text at body is wrapped in <p>'s.- Returns:
true
if tidy will wrap text at body in <p>'s.- See Also:
-
setEncloseBlockText
public void setEncloseBlockText(boolean encloseBlockText) enclose-block-text
- if true text in blocks is wrapped in <p>'s.
- Parameters:
encloseBlockText
- true if tidy should wrap text in blocks in <p>'s.
-
getEncloseBlockText
public boolean getEncloseBlockText() enclose-block-text
- if true text in blocks is wrapped in <p>'s.
- Returns:
- true if tidy will wrap text in blocks in <p>'s.
-
setWord2000
public void setWord2000(boolean word2000) word-2000
- draconian cleaning for Word2000.- Parameters:
word2000
-true
if tidy should clean word2000 documents- See Also:
-
getWord2000
public boolean getWord2000()word-2000
- draconian cleaning for Word2000.- Returns:
true
if tidy will clean word2000 documents- See Also:
-
setTidyMark
public void setTidyMark(boolean tidyMark) tidy-mark
- add meta element indicating tidied doc.- Parameters:
tidyMark
-true
if tidy should add meta element indicating tidied doc- See Also:
-
getTidyMark
public boolean getTidyMark()tidy-mark
- add meta element indicating tidied doc.- Returns:
true
if tidy will add meta element indicating tidied doc- See Also:
-
setXmlSpace
public void setXmlSpace(boolean xmlSpace) add-xml-space
- if set to yes adds xml:space attr as needed.- Parameters:
xmlSpace
-true
if tidy should add xml:space attr as needed- See Also:
-
getXmlSpace
public boolean getXmlSpace()add-xml-space
- if set to yes adds xml:space attr as needed.- Returns:
true
if tidy will add xml:space attr as needed- See Also:
-
setEmacs
public void setEmacs(boolean emacs) gnu-emacs
- if true format error output for GNU Emacs.- Parameters:
emacs
-true
if tidy should format error output for GNU Emacs- See Also:
-
getEmacs
public boolean getEmacs()gnu-emacs
- if true format error output for GNU Emacs.- Returns:
true
if tidy will format error output for GNU Emacs- See Also:
-
setLiteralAttribs
public void setLiteralAttribs(boolean literalAttribs) literal-attributes
- if true attributes may use newlines.- Parameters:
literalAttribs
-true
if attributes may use newlines- See Also:
-
getLiteralAttribs
public boolean getLiteralAttribs()literal-attributes
- if true attributes may use newlines.- Returns:
true
if attributes may use newlines- See Also:
-
setPrintBodyOnly
public void setPrintBodyOnly(boolean bodyOnly) print-body-only
- output BODY content only.- Parameters:
bodyOnly
- true = print only the document body- See Also:
-
getPrintBodyOnly
public boolean getPrintBodyOnly()print-body-only
- output BODY content only.- Returns:
- true if tidy will print only the document body
-
setFixUri
public void setFixUri(boolean fixUri) fix-uri
- fix uri references applying URI encoding if necessary.- Parameters:
fixUri
- true = fix uri references- See Also:
-
getFixUri
public boolean getFixUri()fix-uri
- fix uri references applying URI encoding if necessary.- Returns:
- true if tidy will fix uri references
-
setLowerLiterals
public void setLowerLiterals(boolean lowerLiterals) lower-literals
- folds known attribute values to lower case.- Parameters:
lowerLiterals
- true = folds known attribute values to lower case- See Also:
-
getLowerLiterals
public boolean getLowerLiterals()lower-literals
- folds known attribute values to lower case.- Returns:
- true if tidy will fold known attribute values to lower case
-
setHideComments
public void setHideComments(boolean hideComments) hide-comments
- hides all (real) comments in output.- Parameters:
hideComments
- true = hides all comments in output- See Also:
-
getHideComments
public boolean getHideComments()hide-comments
- hides all (real) comments in output.- Returns:
- true if tidy will hide all comments in output
-
setIndentCdata
public void setIndentCdata(boolean indentCdata) indent-cdata
- indent CDATA sections.- Parameters:
indentCdata
- true = indent CDATA sections- See Also:
-
getIndentCdata
public boolean getIndentCdata()indent-cdata
- indent CDATA sections.- Returns:
- true if tidy will indent CDATA sections
-
setForceOutput
public void setForceOutput(boolean forceOutput) force-output
- output document even if errors were found.- Parameters:
forceOutput
- true = output document even if errors were found- See Also:
-
getForceOutput
public boolean getForceOutput()force-output
- output document even if errors were found.- Returns:
- true if tidy will output document even if errors were found
-
setShowErrors
public void setShowErrors(int showErrors) show-errors
- set the number of errors to put out.- Parameters:
showErrors
- number of errors to put out- See Also:
-
getShowErrors
public int getShowErrors()show-errors
- number of errors to put out.- Returns:
- the number of errors tidy will put out
-
setAsciiChars
public void setAsciiChars(boolean asciiChars) ascii-chars
- convert quotes and dashes to nearest ASCII char.- Parameters:
asciiChars
- true = convert quotes and dashes to nearest ASCII char- See Also:
-
getAsciiChars
public boolean getAsciiChars()ascii-chars
- convert quotes and dashes to nearest ASCII char.- Returns:
- true if tidy will convert quotes and dashes to nearest ASCII char
-
setJoinClasses
public void setJoinClasses(boolean joinClasses) join-classes
- join multiple class attributes.- Parameters:
joinClasses
- true = join multiple class attributes- See Also:
-
getJoinClasses
public boolean getJoinClasses()join-classes
- join multiple class attributes.- Returns:
- true if tidy will join multiple class attributes
-
setJoinStyles
public void setJoinStyles(boolean joinStyles) join-styles
- join multiple style attributes.- Parameters:
joinStyles
- true = join multiple style attributes- See Also:
-
getJoinStyles
public boolean getJoinStyles()join-styles
- join multiple style attributes.- Returns:
- true if tidy will join multiple style attributes
-
setTrimEmptyElements
public void setTrimEmptyElements(boolean trimEmpty) trim-empty-elements
- trim empty elements.- Parameters:
trimEmpty
- true = trim empty elements- See Also:
-
getTrimEmptyElements
public boolean getTrimEmptyElements()trim-empty-elements
- trim empty elements.- Returns:
- true if tidy will trim empty elements
-
setReplaceColor
public void setReplaceColor(boolean replaceColor) replace-color
- replace hex color attribute values with names.- Parameters:
replaceColor
- true = replace hex color attribute values with names- See Also:
-
getReplaceColor
public boolean getReplaceColor()replace-color
- replace hex color attribute values with names.- Returns:
- true if tidy will replace hex color attribute values with names
-
setEscapeCdata
public void setEscapeCdata(boolean escapeCdata) escape-cdata
- replace CDATA sections with escaped text.- Parameters:
escapeCdata
- true = replace CDATA sections with escaped text- See Also:
-
getEscapeCdata
public boolean getEscapeCdata()escape-cdata
- replace CDATA sections with escaped text.- Returns:
- true if tidy will replace CDATA sections with escaped text
-
setRepeatedAttributes
public void setRepeatedAttributes(int repeatedAttributes) repeated-attributes
- keep first or last duplicate attribute.- Parameters:
repeatedAttributes
-Configuration.KEEP_FIRST | Configuration.KEEP_LAST
- See Also:
-
getRepeatedAttributes
public int getRepeatedAttributes()repeated-attributes
- keep first or last duplicate attribute.- Returns:
Configuration.KEEP_FIRST | Configuration.KEEP_LAST
-
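The keep-first versus keep-last choice can be illustrated with a small standalone sketch. It does not call JTidy and is not its implementation: the class `RepeatedAttributesSketch`, the method `dedupe`, and the two constant values are hypothetical stand-ins for `Configuration.KEEP_FIRST` and `Configuration.KEEP_LAST`, whose real values are not shown on this page.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustration only: models the keep-first / keep-last policy, not JTidy's code. */
final class RepeatedAttributesSketch {
    // Hypothetical stand-ins; the real Configuration constants may use other values.
    static final int KEEP_FIRST = 0;
    static final int KEEP_LAST = 1;

    /**
     * Collapses duplicate attribute names.
     * Each entry of attrs is a {name, value} pair in source order.
     */
    static Map<String, String> dedupe(List<String[]> attrs, int policy) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String[] attr : attrs) {
            String name = attr[0];
            String value = attr[1];
            // KEEP_LAST: later values overwrite earlier ones.
            // KEEP_FIRST: only the first value for a name is stored.
            if (policy == KEEP_LAST || !result.containsKey(name)) {
                result.put(name, value);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> attrs = Arrays.asList(
                new String[] {"class", "a"},
                new String[] {"id", "x"},
                new String[] {"class", "b"});
        System.out.println(dedupe(attrs, KEEP_FIRST)); // {class=a, id=x}
        System.out.println(dedupe(attrs, KEEP_LAST));  // {class=b, id=x}
    }
}
```

With the real API, the corresponding call would pass one of the two `Configuration` constants to `setRepeatedAttributes`.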
setKeepFileTimes
public void setKeepFileTimes(boolean keepFileTimes) keep-time
- if true last modified time is preserved.- Parameters:
keepFileTimes
-true
if tidy should preserve the last modified time of the input file.- See Also:
- To do:
- this is NOT supported at this time.
-
getKeepFileTimes
public boolean getKeepFileTimes()keep-time
- if true last modified time is preserved.- Returns:
true
if tidy will preserve the last modified time of the input file.- See Also:
- To do:
- this is NOT supported at this time.
-
setRawOut
public void setRawOut(boolean rawOut) output-raw
- avoid mapping values > 127 to entities. This has the same effect as specifying a "raw" encoding in the original version of tidy.- Parameters:
rawOut
- avoid mapping values > 127 to entities- See Also:
-
getRawOut
public boolean getRawOut()output-raw
- avoid mapping values > 127 to entities.- Returns:
true
if tidy will not map values > 127 to entities- See Also:
-
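To make the effect of output-raw concrete, the following standalone sketch contrasts the two behaviours. It is an illustration under stated assumptions, not JTidy's output code: the class `RawOutSketch` and the method `render` are hypothetical, and numeric character references are used for simplicity where a real printer might choose a different entity form.

```java
/** Illustration only: models "map values > 127 to entities" versus raw output. */
final class RawOutSketch {
    /**
     * Returns text unchanged when rawOut is true; otherwise replaces every
     * code point above 127 with a numeric character reference.
     */
    static String render(String text, boolean rawOut) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            if (codePoint > 127 && !rawOut) {
                out.append("&#").append(codePoint).append(';');
            } else {
                out.appendCodePoint(codePoint);
            }
            // Advance by one or two chars depending on the code point width.
            i += Character.charCount(codePoint);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String input = "caf\u00e9"; // "café", written with an escape to avoid source-encoding issues
        System.out.println(render(input, false)); // caf&#233;
        System.out.println(render(input, true));
    }
}
```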
setInputEncoding
input-encoding
the character encoding used for input.- Parameters:
encoding
- a valid java encoding name
-
getInputEncoding
input-encoding
the character encoding used for input.- Returns:
- the java name of the encoding currently used for input
-
setOutputEncoding
output-encoding
the character encoding used for output.- Parameters:
encoding
- a valid java encoding name
-
getOutputEncoding
output-encoding
the character encoding used for output.- Returns:
- the java name of the encoding currently used for output
-
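Since the encoding setters expect a valid Java encoding name, it can help to check a name before passing it in. The sketch below uses only the standard `java.nio.charset.Charset` class. The helper `EncodingNameCheck.isKnownEncoding` is hypothetical, and it is an assumption (not confirmed on this page) that the names Tidy accepts coincide with those the running JVM reports as supported.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

/** Hypothetical helper: checks whether the running JVM recognises an encoding name. */
final class EncodingNameCheck {
    static boolean isKnownEncoding(String name) {
        if (name == null) {
            return false;
        }
        try {
            // True for canonical names and aliases the JVM supports.
            return Charset.isSupported(name);
        } catch (IllegalCharsetNameException e) {
            // Thrown for syntactically illegal names, e.g. names containing spaces.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isKnownEncoding("UTF-8"));            // true
        System.out.println(isKnownEncoding("no such encoding")); // false
    }
}
```

A caller could run this check first and only then hand the name to `setInputEncoding` or `setOutputEncoding`, falling back to a default otherwise.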