Package org.w3c.tidy

Class Tidy

java.lang.Object
org.w3c.tidy.Tidy
All Implemented Interfaces:
Serializable

public class Tidy extends Object implements Serializable
HTML parser and pretty printer.
Version:
$Revision: 1033 $ ($Author: aditsu $)
Author:
Dave Raggett dsr@w3.org, Andy Quick ac.quick@sympatico.ca (translation to Java), Fabrizio Giustina
See Also:
  • Constructor Summary

    Constructors
    Constructor
    Description
    Instantiates a new Tidy instance.
  • Method Summary

    Modifier and Type
    Method
    Description
    static Document
    Creates an empty DOM Document.
    alt-text- default text for alt attribute.
    boolean
    ascii-chars- convert quotes and dashes to nearest ASCII char.
    boolean
    break-before-br - output newline before <br>.
    boolean
    split- create slides on each h2 element.
    Returns the current configuration.
    doctype- user specified doctype.
    boolean
    drop-empty-paras- discard empty p elements.
    boolean
    drop-font-tags- discard presentation tags.
    boolean
    drop-proprietary-attributes- discard proprietary attributes.
    boolean
    gnu-emacs- if true format error output for GNU Emacs.
    boolean
    enclose-block-text- if true text in blocks is wrapped in <p>'s.
    boolean
    enclose-text- if true text at body is wrapped in <p>'s.
    Errfile - file name to write errors to.
    Errout - the error output stream.
    boolean
    escape-cdata - replace CDATA sections with escaped text.
    boolean
    fix-backslash- fix URLs by replacing \ with /.
    boolean
    fix-bad-comments- fix comments with adjacent hyphens.
    boolean
    fix-uri- fix uri references applying URI encoding if necessary.
    boolean
    force-output- output document even if errors were found.
    boolean
    hide-comments- hides all (real) comments in output.
    boolean
    hide-endtags - suppress optional end tags.
    boolean
    indent-attributes- newline+indent before each attribute.
    boolean
    indent-cdata- indent CDATA sections.
    boolean
    indent - indent content of appropriate tags.
    input-encoding - the character encoding used for input.
     
    boolean
    join-classes- join multiple class attributes.
    boolean
    join-styles- join multiple style attributes.
    boolean
    keep-time- if true last modified time is preserved.
    boolean
    literal-attributes- if true attributes may use newlines.
    boolean
    logical-emphasis- replace i by em and b by strong.
    boolean
    lower-literals- folds known attribute values to lower case.
    boolean
    make-bare - remove Microsoft cruft.
    boolean
    make-clean - remove presentational clutter.
    boolean
    numeric-entities- output entities other than the built-in HTML entities in the numeric rather than the named entity form.
    boolean
    only-errors - if true normal output is suppressed.
    output-encoding - the character encoding used for output.
    int
    ParseErrors - the number of errors that occurred in the most recent parse operation.
    int
    ParseWarnings - the number of warnings that occurred in the most recent parse operation.
    boolean
    print-body-only- output BODY content only.
    boolean
    quiet - no 'Parsing X', guessed DTD or summary.
    boolean
    quote-ampersand- output naked ampersand as &amp;.
    boolean
    quote-marks- output " marks as &quot;.
    boolean
    quote-nbsp- output non-breaking space as entity.
    boolean
    output-raw- avoid mapping values > 127 to entities.
    int
    repeated-attributes- keep first or last duplicate attribute.
    boolean
    replace-color- replace hex color attribute values with names.
    int
    show-errors- number of errors to put out.
    boolean
    show-warnings - show warnings? (errors are always shown).
    boolean
    SmartIndent - does text/block level content affect indentation.
    int
    indent-spaces- default indentation.
     
    int
    tab-size- tab size in chars.
    boolean
    tidy-mark- add meta element indicating tidied doc.
    boolean
    trim-empty-elements- trim empty elements.
    boolean
    uppercase-attributes - output attributes in upper case.
    boolean
    uppercase-tags - output tags in upper case.
    boolean
    word-2000- draconian cleaning for Word2000.
    boolean
    wrap-asp- wrap within ASP pseudo elements.
    boolean
    wrap-attributes- wrap within attribute values.
    boolean
    wrap-jste- wrap within JSTE pseudo elements.
    int
    wrap- default wrap margin.
    boolean
    wrap-php- wrap within PHP pseudo elements.
    boolean
    wrap-script-literals- wrap within JavaScript string literals.
    boolean
    wrap-sections- wrap within <![ ... ]> section tags
    boolean
    writeback - if true then output tidied markup.
    boolean
    output-xhtml - output extensible HTML.
    boolean
    output-xml - create output as XML.
    boolean
    add-xml-pi- add <?xml?> for XML docs.
    boolean
    assume-xml-procins - specifies whether Tidy should change the parsing of processing instructions to require ?> as the terminator rather than >.
    boolean
    add-xml-space- if set to yes adds xml:space attr as needed.
    boolean
    input-xml - treat input as XML.
    static void
    main(String[] argv)
    Command line interface to parser and pretty printer.
    protected int
    mainExec(String[] argv)
    Main method, but returns the return code as an int instead of calling System.exit(code).
    Reads from the given input and returns the root Node.
    Reads from the given input and returns the root Node.
    Reads from the given input and returns the root Node.
    parse(Reader in, Writer out)
    Reads from the given input and returns the root Node.
    Parses InputStream in and returns a DOM Document node.
     
    void
    Pretty-prints a DOM Document.
    void
    pprint(Node node, OutputStream out)
    Pretty-prints a DOM Node.
    void
    setAltText(String altText)
    alt-text- default text for alt attribute.
    void
    setAsciiChars(boolean asciiChars)
    ascii-chars- convert quotes and dashes to nearest ASCII char.
    void
    setBreakBeforeBR(boolean breakBeforeBR)
    break-before-br - output newline before <br>.
    void
    setBurstSlides(boolean burstSlides)
    split- create slides on each h2 element.
    void
    Sets the configuration from a configuration file.
    void
    Sets the configuration from a properties object.
    void
    setDocType(String doctype)
    doctype- user specified doctype.
    void
    setDropEmptyParas(boolean dropEmptyParas)
    drop-empty-paras- discard empty p elements.
    void
    setDropFontTags(boolean dropFontTags)
    drop-font-tags- discard presentation tags.
    void
    setDropProprietaryAttributes(boolean dropProprietaryAttributes)
    drop-proprietary-attributes- discard proprietary attributes.
    void
    setEmacs(boolean emacs)
    gnu-emacs- if true format error output for GNU Emacs.
    void
    setEncloseBlockText(boolean encloseBlockText)
    enclose-block-text- if true text in blocks is wrapped in <p>'s.
    void
    setEncloseText(boolean encloseText)
    enclose-text- if true text at body is wrapped in <p>'s.
    void
    setErrfile(String errfile)
    Errfile - file name to write errors to.
    void
     
    void
    setEscapeCdata(boolean escapeCdata)
    escape-cdata- replace CDATA sections with escaped text.
    void
    setFixBackslash(boolean fixBackslash)
    fix-backslash- fix URLs by replacing \ with /.
    void
    setFixComments(boolean fixComments)
    fix-bad-comments- fix comments with adjacent hyphens.
    void
    setFixUri(boolean fixUri)
    fix-uri- fix uri references applying URI encoding if necessary.
    void
    setForceOutput(boolean forceOutput)
    force-output- output document even if errors were found.
    void
    setHideComments(boolean hideComments)
    hide-comments- hides all (real) comments in output.
    void
    setHideEndTags(boolean hideEndTags)
    hide-endtags - suppress optional end tags.
    void
    setIndentAttributes(boolean indentAttributes)
    indent-attributes- newline+indent before each attribute.
    void
    setIndentCdata(boolean indentCdata)
    indent-cdata- indent CDATA sections.
    void
    setIndentContent(boolean indentContent)
    indent - indent content of appropriate tags.
    void
    input-encoding - the character encoding used for input.
    void
    InputStreamName - the name of the input stream (printed in the header information).
    void
    setJoinClasses(boolean joinClasses)
    join-classes- join multiple class attributes.
    void
    setJoinStyles(boolean joinStyles)
    join-styles- join multiple style attributes.
    void
    setKeepFileTimes(boolean keepFileTimes)
    keep-time- if true last modified time is preserved.
    void
    setLiteralAttribs(boolean literalAttribs)
    literal-attributes- if true attributes may use newlines.
    void
    setLogicalEmphasis(boolean logicalEmphasis)
    logical-emphasis- replace i by em and b by strong.
    void
    setLowerLiterals(boolean lowerLiterals)
    lower-literals- folds known attribute values to lower case.
    void
    setMakeBare(boolean makeBare)
    make-bare - remove Microsoft cruft.
    void
    setMakeClean(boolean makeClean)
    make-clean - remove presentational clutter.
    void
    Attach a TidyMessageListener which will be notified for messages and errors.
    void
    setNumEntities(boolean numEntities)
    numeric-entities- output entities other than the built-in HTML entities in the numeric rather than the named entity form.
    void
    setOnlyErrors(boolean onlyErrors)
    only-errors - if true normal output is suppressed.
    void
    output-encoding - the character encoding used for output.
    void
    setPrintBodyOnly(boolean bodyOnly)
    print-body-only- output BODY content only.
    void
    setQuiet(boolean quiet)
    quiet - no 'Parsing X', guessed DTD or summary.
    void
    setQuoteAmpersand(boolean quoteAmpersand)
    quote-ampersand- output naked ampersand as &amp;.
    void
    setQuoteMarks(boolean quoteMarks)
    quote-marks- output " marks as &quot;.
    void
    setQuoteNbsp(boolean quoteNbsp)
    quote-nbsp- output non-breaking space as entity.
    void
    setRawOut(boolean rawOut)
    output-raw- avoid mapping values > 127 to entities.
    void
    setRepeatedAttributes(int repeatedAttributes)
    repeated-attributes- keep first or last duplicate attribute.
    void
    setReplaceColor(boolean replaceColor)
    replace-color- replace hex color attribute values with names.
    void
    setShowErrors(int showErrors)
    show-errors- set the number of errors to put out.
    void
    setShowWarnings(boolean showWarnings)
    show-warnings - show warnings? (errors are always shown).
    void
    setSmartIndent(boolean smartIndent)
    SmartIndent - does text/block level content affect indentation.
    void
    setSpaces(int spaces)
    indent-spaces- default indentation.
    void
    setTabsize(int tabsize)
    tab-size- tab size in chars.
    void
    setTidyMark(boolean tidyMark)
    tidy-mark- add meta element indicating tidied doc.
    void
    setTrimEmptyElements(boolean trimEmpty)
    trim-empty-elements- trim empty elements.
    void
    setUpperCaseAttrs(boolean upperCaseAttrs)
    uppercase-attributes - output attributes in upper case.
    void
    setUpperCaseTags(boolean upperCaseTags)
    uppercase-tags - output tags in upper case.
    void
    setWord2000(boolean word2000)
    word-2000- draconian cleaning for Word2000.
    void
    setWrapAsp(boolean wrapAsp)
    wrap-asp- wrap within ASP pseudo elements.
    void
    setWrapAttVals(boolean wrapAttVals)
    wrap-attributes- wrap within attribute values.
    void
    setWrapJste(boolean wrapJste)
    wrap-jste- wrap within JSTE pseudo elements.
    void
    setWraplen(int wraplen)
    wrap- default wrap margin.
    void
    setWrapPhp(boolean wrapPhp)
    wrap-php- wrap within PHP pseudo elements.
    void
    setWrapScriptlets(boolean wrapScriptlets)
    wrap-script-literals- wrap within JavaScript string literals.
    void
    setWrapSection(boolean wrapSection)
    wrap-sections- wrap within <![ ... ]> section tags
    void
    setWriteback(boolean writeback)
    writeback - if true then output tidied markup.
    void
    setXHTML(boolean xhtml)
    output-xhtml - output extensible HTML.
    void
    setXmlOut(boolean xmlOut)
    output-xml - create output as XML.
    void
    setXmlPi(boolean xmlPi)
    add-xml-pi- add <?xml?> for XML docs.
    void
    setXmlPIs(boolean xmlPIs)
    assume-xml-procins - specifies whether Tidy should change the parsing of processing instructions to require ?> as the terminator rather than >.
    void
    setXmlSpace(boolean xmlSpace)
    add-xml-space- if set to yes adds xml:space attr as needed.
    void
    setXmlTags(boolean xmlTags)
    input-xml - treat input as XML.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • Tidy

      public Tidy()
      Instantiates a new Tidy instance. It is recommended to use a new instance for each parse.
  • Method Details

    • getConfiguration

      public Configuration getConfiguration()
      Returns the current configuration.
      Returns:
      tidy configuration
    • getStderr

      public PrintWriter getStderr()
    • getParseErrors

      public int getParseErrors()
      ParseErrors - the number of errors that occurred in the most recent parse operation.
      Returns:
      number of errors that occurred in the most recent parse operation.
    • getParseWarnings

      public int getParseWarnings()
      ParseWarnings - the number of warnings that occurred in the most recent parse operation.
      Returns:
      number of warnings that occurred in the most recent parse operation.
    • setInputStreamName

      public void setInputStreamName(String name)
      InputStreamName - the name of the input stream (printed in the header information).
      Parameters:
      name - input stream name
    • getInputStreamName

      public String getInputStreamName()
    • getErrout

      public PrintWriter getErrout()
      Errout - the error output stream.
      Returns:
      error output stream.
    • setErrout

      public void setErrout(PrintWriter out)
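One common use of an error-output setter that accepts a PrintWriter is to collect diagnostics in memory instead of on the console. The sketch below only shows how such a writer can be built with the standard library; the class and field names are hypothetical, the printed line is a placeholder rather than real Tidy output, and the actual setErrout call appears only as a comment because it needs JTidy on the classpath.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

final class ErroutCaptureSketch {
    /** Pairs a PrintWriter with the in-memory buffer it writes to. */
    static final class Capture {
        final StringWriter buffer = new StringWriter();
        final PrintWriter writer = new PrintWriter(buffer, true);

        /** Flushes pending output and returns everything written so far. */
        String text() {
            writer.flush();
            return buffer.toString();
        }
    }

    public static void main(String[] args) {
        Capture capture = new Capture();
        // With JTidy available, the writer would be handed over before parsing:
        //   tidy.setErrout(capture.writer);
        // Placeholder line standing in for whatever the parser would report.
        capture.writer.println("placeholder diagnostic");
        System.out.print(capture.text());
    }
}
```

After a parse, capture.text() would then hold the diagnostics as a single string that can be logged or inspected.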
    • setConfigurationFromFile

      public void setConfigurationFromFile(String filename)
      Sets the configuration from a configuration file.
      Parameters:
      filename - configuration file name/path.
    • setConfigurationFromProps

      public void setConfigurationFromProps(Properties props)
      Sets the configuration from a properties object.
      Parameters:
      props - Properties object
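As a rough illustration of what a Properties object for this method might contain, the sketch below fills one with option names taken from the method descriptions on this page (output-xhtml, indent, indent-spaces, wrap, quiet, show-warnings). The yes/no and numeric values are assumptions based on Tidy's usual configuration conventions, the class and method names are hypothetical, and the JTidy call itself is shown only as a comment.

```java
import java.util.Properties;

final class TidyPropsSketch {
    /** Builds option name/value pairs keyed by the option names used on this page. */
    static Properties buildTidyProps() {
        Properties props = new Properties();
        props.setProperty("output-xhtml", "yes");  // assumed value syntax
        props.setProperty("indent", "yes");
        props.setProperty("indent-spaces", "4");
        props.setProperty("wrap", "100");
        props.setProperty("quiet", "yes");
        props.setProperty("show-warnings", "no");
        return props;
    }

    public static void main(String[] args) {
        Properties props = buildTidyProps();
        // With JTidy on the classpath, the object would be applied like this:
        //   org.w3c.tidy.Tidy tidy = new org.w3c.tidy.Tidy();
        //   tidy.setConfigurationFromProps(props);
        for (String name : props.stringPropertyNames()) {
            System.out.println(name + ": " + props.getProperty(name));
        }
    }
}
```

Keeping the options in one Properties object makes it easy to load the same settings from a file or share them between instances.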
    • createEmptyDocument

      public static Document createEmptyDocument()
      Creates an empty DOM Document.
      Returns:
      a new org.w3c.dom.Document
    • parse

      public Node parse(InputStream in, OutputStream out)
      Reads from the given input and returns the root Node. If out is non-null, pretty prints to out. Warning: caller is responsible for calling close() on input and output after calling this method.
      Parameters:
      in - input
      out - optional destination for pretty-printed document
      Returns:
      parsed org.w3c.tidy.Node
    • parse

      public Node parse(Reader in, OutputStream out)
      Reads from the given input and returns the root Node. If out is non-null, pretty prints to out. Warning: caller is responsible for calling close() on input and output after calling this method.
      Parameters:
      in - input
      out - optional destination for pretty-printed document
      Returns:
      parsed org.w3c.tidy.Node
    • parse

      public Node parse(Reader in, Writer out)
      Reads from the given input and returns the root Node. If out is non-null, pretty prints to out. Warning: caller is responsible for calling close() on input and output after calling this method.
      Parameters:
      in - input
      out - optional destination for pretty-printed document
      Returns:
      parsed org.w3c.tidy.Node
    • parse

      public Node parse(InputStream in, Writer out)
      Reads from the given input and returns the root Node. If out is non-null, pretty prints to out. Warning: caller is responsible for calling close() on input and output after calling this method.
      Parameters:
      in - input
      out - optional destination for pretty-printed document
      Returns:
      parsed org.w3c.tidy.Node
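Because the parse methods leave closing the input and output to the caller, a try-with-resources block is one straightforward way to meet that obligation. The sketch below uses a hypothetical StreamParser interface as a stand-in with the same parameter shape as parse(InputStream, OutputStream), so it runs without JTidy; the stand-in parser in main merely copies bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

final class CloseStreamsSketch {
    /** Hypothetical stand-in mirroring the shape of parse(InputStream, OutputStream). */
    interface StreamParser {
        void parse(InputStream in, OutputStream out) throws IOException;
    }

    /** Runs the parser, then closes both streams even if parsing throws. */
    static void parseAndClose(StreamParser parser, InputStream in, OutputStream out)
            throws IOException {
        try (InputStream input = in; OutputStream output = out) {
            parser.parse(input, output);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] html = "<p>hello".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // Stand-in that copies bytes; with JTidy this lambda could instead be
        //   (in, out) -> tidy.parse(in, out)
        StreamParser copy = (in, out) -> {
            int b;
            while ((b = in.read()) != -1) {
                out.write(b);
            }
        };
        parseAndClose(copy, new ByteArrayInputStream(html), sink);
        System.out.println(new String(sink.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

Resources declared in the try header are closed in reverse order, so the output is closed before the input.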
    • parseDOM

      public Document parseDOM(InputStream in, OutputStream out)
      Parses InputStream in and returns a DOM Document node. If out is non-null, pretty prints to OutputStream out.
      Parameters:
      in - input stream
      out - optional output stream
      Returns:
      parsed org.w3c.dom.Document
    • parseDOM

      public Document parseDOM(Reader in, Writer out)
    • pprint

      public void pprint(Document doc, OutputStream out)
      Pretty-prints a DOM Document. The document must be an instance of org.w3c.tidy.DOMDocumentImpl. The caller is responsible for closing the output stream after calling this method.
      Parameters:
      doc - org.w3c.dom.Document
      out - output stream
    • pprint

      public void pprint(Node node, OutputStream out)
      Pretty-prints a DOM Node. Caller is responsible for closing the outputStream after calling this method.
      Parameters:
      node - org.w3c.dom.Node. Must be an instance of org.w3c.tidy.DOMNodeImpl.
      out - output stream
    • main

      public static void main(String[] argv)
      Command line interface to parser and pretty printer.
      Parameters:
      argv - command line parameters
    • mainExec

      protected int mainExec(String[] argv)
      Main method, but returns the return code as an int instead of calling System.exit(code). Needed for testing main method without shutting down tests.
      Parameters:
      argv - command line parameters
      Returns:
      return code
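The split between main and mainExec described above is a general pattern: the worker method returns a code, and only the thin entry point calls System.exit. The sketch below illustrates that pattern with a hypothetical class and made-up return codes; it is not the Tidy implementation.

```java
final class ExitCodeSketch {
    /**
     * Does the real work and reports the outcome as a return code:
     * 0 if every argument is non-blank, 1 otherwise (codes are hypothetical).
     */
    protected int mainExec(String[] argv) {
        int blank = 0;
        for (String arg : argv) {
            if (arg.trim().isEmpty()) {
                blank++;
            }
        }
        if (blank > 0) {
            System.err.println(blank + " blank argument(s)");
            return 1;
        }
        return 0;
    }

    public static void main(String[] argv) {
        // Only the entry point terminates the JVM; tests call mainExec directly.
        System.exit(new ExitCodeSketch().mainExec(argv));
    }
}
```

A test can then assert on the value returned by mainExec without the test runner's JVM being shut down.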
    • setMessageListener

      public void setMessageListener(TidyMessageListener listener)
      Attach a TidyMessageListener which will be notified for messages and errors.
      Parameters:
      listener - TidyMessageListener implementation
    • setSpaces

      public void setSpaces(int spaces)
      indent-spaces- default indentation.
      Parameters:
      spaces - number of spaces used for indentation
      See Also:
    • getSpaces

      public int getSpaces()
      indent-spaces- default indentation.
      Returns:
      number of spaces used for indentation
      See Also:
    • setWraplen

      public void setWraplen(int wraplen)
      wrap- default wrap margin.
      Parameters:
      wraplen - default wrap margin
      See Also:
    • getWraplen

      public int getWraplen()
      wrap- default wrap margin.
      Returns:
      default wrap margin
      See Also:
    • setTabsize

      public void setTabsize(int tabsize)
      tab-size- tab size in chars.
      Parameters:
      tabsize - tab size in chars
      See Also:
    • getTabsize

      public int getTabsize()
      tab-size- tab size in chars.
      Returns:
      tab size in chars
      See Also:
    • setErrfile

      public void setErrfile(String errfile)
      Errfile - file name to write errors to.
      Parameters:
      errfile - file name to write errors to
      See Also:
    • getErrfile

      public String getErrfile()
      Errfile - file name to write errors to.
      Returns:
      error file name
      See Also:
    • setWriteback

      public void setWriteback(boolean writeback)
      writeback - if true then output tidied markup. NOTE: this property is ignored when parsing from an InputStream.
      Parameters:
      writeback - true= output tidied markup
      See Also:
    • getWriteback

      public boolean getWriteback()
      writeback - if true then output tidied markup. NOTE: this property is ignored when parsing from an InputStream.
      Returns:
      true if tidy will write the tidied markup back to the input file
      See Also:
    • setOnlyErrors

      public void setOnlyErrors(boolean onlyErrors)
      only-errors - if true normal output is suppressed.
      Parameters:
      onlyErrors - if true normal output is suppressed.
      See Also:
    • getOnlyErrors

      public boolean getOnlyErrors()
      only-errors - if true normal output is suppressed.
      Returns:
      true if normal output is suppressed.
      See Also:
    • setShowWarnings

      public void setShowWarnings(boolean showWarnings)
      show-warnings - show warnings? (errors are always shown).
      Parameters:
      showWarnings - if false warnings are not shown
      See Also:
    • getShowWarnings

      public boolean getShowWarnings()
      show-warnings - show warnings? (errors are always shown).
      Returns:
      false if warnings are not shown
      See Also:
    • setQuiet

      public void setQuiet(boolean quiet)
      quiet - no 'Parsing X', guessed DTD or summary.
      Parameters:
      quiet - true= don't output summary, warnings or errors
      See Also:
    • getQuiet

      public boolean getQuiet()
      quiet - no 'Parsing X', guessed DTD or summary.
      Returns:
      true if tidy will not output summary, warnings or errors
      See Also:
    • setIndentContent

      public void setIndentContent(boolean indentContent)
      indent - indent content of appropriate tags.
      Parameters:
      indentContent - indent content of appropriate tags
      See Also:
    • getIndentContent

      public boolean getIndentContent()
      indent - indent content of appropriate tags.
      Returns:
      true if tidy will indent content of appropriate tags
      See Also:
    • setSmartIndent

      public void setSmartIndent(boolean smartIndent)
      SmartIndent - does text/block level content affect indentation.
      Parameters:
      smartIndent - true if text/block level content should affect indentation
      See Also:
    • getSmartIndent

      public boolean getSmartIndent()
      SmartIndent - does text/block level content affect indentation.
      Returns:
      true if text/block level content should affect indentation
      See Also:
    • setHideEndTags

      public void setHideEndTags(boolean hideEndTags)
      hide-endtags - suppress optional end tags.
      Parameters:
      hideEndTags - true= suppress optional end tags
      See Also:
    • getHideEndTags

      public boolean getHideEndTags()
      hide-endtags - suppress optional end tags.
      Returns:
      true if tidy will suppress optional end tags
      See Also:
    • setXmlTags

      public void setXmlTags(boolean xmlTags)
      input-xml - treat input as XML.
      Parameters:
      xmlTags - true if tidy should treat input as XML
      See Also:
    • getXmlTags

      public boolean getXmlTags()
      input-xml - treat input as XML.
      Returns:
      true if tidy will treat input as XML
      See Also:
    • setXmlOut

      public void setXmlOut(boolean xmlOut)
      output-xml - create output as XML.
      Parameters:
      xmlOut - true if tidy should create output as xml
      See Also:
    • getXmlOut

      public boolean getXmlOut()
      output-xml - create output as XML.
      Returns:
      true if tidy will create output as xml
      See Also:
    • setXHTML

      public void setXHTML(boolean xhtml)
      output-xhtml - output extensible HTML.
      Parameters:
      xhtml - true if tidy should output XHTML
      See Also:
    • getXHTML

      public boolean getXHTML()
      output-xhtml - output extensible HTML.
      Returns:
      true if tidy will output XHTML
      See Also:
    • setUpperCaseTags

      public void setUpperCaseTags(boolean upperCaseTags)
      uppercase-tags - output tags in upper case.
      Parameters:
      upperCaseTags - true if tidy should output tags in upper case (default is lowercase)
      See Also:
    • getUpperCaseTags

      public boolean getUpperCaseTags()
      uppercase-tags - output tags in upper case.
      Returns:
      true if tidy will output tags in upper case
      See Also:
    • setUpperCaseAttrs

      public void setUpperCaseAttrs(boolean upperCaseAttrs)
      uppercase-attributes - output attributes in upper case.
      Parameters:
      upperCaseAttrs - true if tidy should output attributes in upper case (default is lowercase)
      See Also:
    • getUpperCaseAttrs

      public boolean getUpperCaseAttrs()
      uppercase-attributes - output attributes in upper case.
      Returns:
      true if tidy will output attributes in upper case
      See Also:
    • setMakeClean

      public void setMakeClean(boolean makeClean)
      make-clean - remove presentational clutter.
      Parameters:
      makeClean - true to remove presentational clutter
      See Also:
    • getMakeClean

      public boolean getMakeClean()
      make-clean - remove presentational clutter.
      Returns:
      true if tidy will remove presentational clutter
      See Also:
    • setMakeBare

      public void setMakeBare(boolean makeBare)
      make-bare - remove Microsoft cruft.
      Parameters:
      makeBare - true to remove Microsoft cruft
      See Also:
    • getMakeBare

      public boolean getMakeBare()
      make-bare - remove Microsoft cruft.
      Returns:
      true if tidy will remove Microsoft cruft
      See Also:
    • setBreakBeforeBR

      public void setBreakBeforeBR(boolean breakBeforeBR)
      break-before-br - output newline before <br>.
      Parameters:
      breakBeforeBR - true if tidy should output a newline before <br>
      See Also:
    • getBreakBeforeBR

      public boolean getBreakBeforeBR()
      break-before-br - output newline before <br>.
      Returns:
      true if tidy will output a newline before <br>
      See Also:
    • setBurstSlides

      public void setBurstSlides(boolean burstSlides)
      split- create slides on each h2 element.
      Parameters:
      burstSlides - true if tidy should create slides on each h2 element
      See Also:
    • getBurstSlides

      public boolean getBurstSlides()
      split- create slides on each h2 element.
      Returns:
      true if tidy will create slides on each h2 element
      See Also:
    • setNumEntities

      public void setNumEntities(boolean numEntities)
      numeric-entities- output entities other than the built-in HTML entities in the numeric rather than the named entity form.
      Parameters:
      numEntities - true if tidy should output entities in the numeric form.
      See Also:
    • getNumEntities

      public boolean getNumEntities()
      numeric-entities- output entities other than the built-in HTML entities in the numeric rather than the named entity form.
      Returns:
      true if tidy will output entities in the numeric form.
      See Also:
    • setQuoteMarks

      public void setQuoteMarks(boolean quoteMarks)
      quote-marks- output " marks as &quot;.
      Parameters:
      quoteMarks - true if tidy should output " marks as &quot;
      See Also:
    • getQuoteMarks

      public boolean getQuoteMarks()
      quote-marks- output " marks as &quot;.
      Returns:
      true if tidy will output " marks as &quot;
      See Also:
    • setQuoteNbsp

      public void setQuoteNbsp(boolean quoteNbsp)
      quote-nbsp- output non-breaking space as entity.
      Parameters:
      quoteNbsp - true if tidy should output non-breaking space as entity
      See Also:
    • getQuoteNbsp

      public boolean getQuoteNbsp()
      quote-nbsp- output non-breaking space as entity.
      Returns:
      true if tidy will output non-breaking space as entity
      See Also:
    • setQuoteAmpersand

      public void setQuoteAmpersand(boolean quoteAmpersand)
      quote-ampersand- output naked ampersand as &amp;.
      Parameters:
      quoteAmpersand - true if tidy should output naked ampersand as &amp;
      See Also:
    • getQuoteAmpersand

      public boolean getQuoteAmpersand()
      quote-ampersand- output naked ampersand as &amp;.
      Returns:
      true if tidy will output naked ampersand as &amp;
      See Also:
    • setWrapAttVals

      public void setWrapAttVals(boolean wrapAttVals)
      wrap-attributes- wrap within attribute values.
      Parameters:
      wrapAttVals - true if tidy should wrap within attribute values
      See Also:
    • getWrapAttVals

      public boolean getWrapAttVals()
      wrap-attributes- wrap within attribute values.
      Returns:
      true if tidy will wrap within attribute values
      See Also:
    • setWrapScriptlets

      public void setWrapScriptlets(boolean wrapScriptlets)
      wrap-script-literals- wrap within JavaScript string literals.
      Parameters:
      wrapScriptlets - true if tidy should wrap within JavaScript string literals
      See Also:
    • getWrapScriptlets

      public boolean getWrapScriptlets()
      wrap-script-literals- wrap within JavaScript string literals.
      Returns:
      true if tidy will wrap within JavaScript string literals
      See Also:
    • setWrapSection

      public void setWrapSection(boolean wrapSection)
      wrap-sections- wrap within <![ ... ]> section tags
      Parameters:
      wrapSection - true if tidy should wrap within <![ ... ]> section tags
      See Also:
    • getWrapSection

      public boolean getWrapSection()
      wrap-sections- wrap within <![ ... ]> section tags
      Returns:
      true if tidy will wrap within <![ ... ]> section tags
      See Also:
    • setAltText

      public void setAltText(String altText)
      alt-text- default text for alt attribute.
      Parameters:
      altText - default text for alt attribute
      See Also:
    • getAltText

      public String getAltText()
      alt-text- default text for alt attribute.
      Returns:
      default text for alt attribute
      See Also:
    • setXmlPi

      public void setXmlPi(boolean xmlPi)
      add-xml-pi- add <?xml?> for XML docs.
      Parameters:
      xmlPi - true if tidy should add <?xml?> for XML docs
      See Also:
    • getXmlPi

      public boolean getXmlPi()
      add-xml-pi- add <?xml?> for XML docs.
      Returns:
      true if tidy will add <?xml?> for XML docs
      See Also:
    • setDropFontTags

      public void setDropFontTags(boolean dropFontTags)
      drop-font-tags- discard presentation tags.
      Parameters:
      dropFontTags - true if tidy should discard presentation tags
      See Also:
    • getDropFontTags

      public boolean getDropFontTags()
      drop-font-tags- discard presentation tags.
      Returns:
      true if tidy will discard presentation tags
      See Also:
    • setDropProprietaryAttributes

      public void setDropProprietaryAttributes(boolean dropProprietaryAttributes)
      drop-proprietary-attributes- discard proprietary attributes.
      Parameters:
      dropProprietaryAttributes - true if tidy should discard proprietary attributes
      See Also:
    • getDropProprietaryAttributes

      public boolean getDropProprietaryAttributes()
      drop-proprietary-attributes- discard proprietary attributes.
      Returns:
      true if tidy will discard proprietary attributes
      See Also:
    • setDropEmptyParas

      public void setDropEmptyParas(boolean dropEmptyParas)
      drop-empty-paras- discard empty p elements.
      Parameters:
      dropEmptyParas - true if tidy should discard empty p elements
      See Also:
    • getDropEmptyParas

      public boolean getDropEmptyParas()
      drop-empty-paras- discard empty p elements.
      Returns:
      true if tidy will discard empty p elements
      See Also:
    • setFixComments

      public void setFixComments(boolean fixComments)
      fix-bad-comments- fix comments with adjacent hyphens.
      Parameters:
      fixComments - true if tidy should fix comments with adjacent hyphens
      See Also:
    • getFixComments

      public boolean getFixComments()
      fix-bad-comments- fix comments with adjacent hyphens.
      Returns:
      true if tidy will fix comments with adjacent hyphens
      See Also:
    • setWrapAsp

      public void setWrapAsp(boolean wrapAsp)
      wrap-asp- wrap within ASP pseudo elements.
      Parameters:
      wrapAsp - true if tidy should wrap within ASP pseudo elements
      See Also:
    • getWrapAsp

      public boolean getWrapAsp()
      wrap-asp- wrap within ASP pseudo elements.
      Returns:
      true if tidy will wrap within ASP pseudo elements
      See Also:
    • setWrapJste

      public void setWrapJste(boolean wrapJste)
      wrap-jste- wrap within JSTE pseudo elements.
      Parameters:
      wrapJste - true if tidy should wrap within JSTE pseudo elements
      See Also:
    • getWrapJste

      public boolean getWrapJste()
      wrap-jste- wrap within JSTE pseudo elements.
      Returns:
      true if tidy will wrap within JSTE pseudo elements
      See Also:
    • setWrapPhp

      public void setWrapPhp(boolean wrapPhp)
      wrap-php- wrap within PHP pseudo elements.
      Parameters:
      wrapPhp - true if tidy should wrap within PHP pseudo elements
      See Also:
    • getWrapPhp

      public boolean getWrapPhp()
      wrap-php- wrap within PHP pseudo elements.
      Returns:
      true if tidy will wrap within PHP pseudo elements
      See Also:
    • setFixBackslash

      public void setFixBackslash(boolean fixBackslash)
      fix-backslash- fix URLs by replacing \ with /.
      Parameters:
      fixBackslash - true if tidy should fix URLs by replacing \ with /
      See Also:
    • getFixBackslash

      public boolean getFixBackslash()
      fix-backslash- fix URLs by replacing \ with /.
      Returns:
      true if tidy will fix URLs by replacing \ with /
      See Also:
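The effect of fix-backslash on a single URL value can be pictured with plain Java. This is only a sketch of the transformation described above, not JTidy's internal code; the class and method names (`FixBackslashSketch`, `fixBackslash`) are hypothetical.

```java
// Hypothetical illustration of the fix-backslash option; not JTidy's internal code.
class FixBackslashSketch {

    /** Replaces every backslash in a URL-like attribute value with a forward slash. */
    static String fixBackslash(String url) {
        return url.replace('\\', '/');
    }

    public static void main(String[] args) {
        String href = "images\\logos\\w3c.png"; // the Java literal for images\logos\w3c.png
        System.out.println(fixBackslash(href)); // prints images/logos/w3c.png
    }
}
```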
    • setIndentAttributes

      public void setIndentAttributes(boolean indentAttributes)
      indent-attributes- newline+indent before each attribute.
      Parameters:
      indentAttributes - true if tidy should output a newline+indent before each attribute
      See Also:
    • getIndentAttributes

      public boolean getIndentAttributes()
      indent-attributes- newline+indent before each attribute.
      Returns:
      true if tidy will output a newline+indent before each attribute
      See Also:
    • setDocType

      public void setDocType(String doctype)
      doctype- user specified doctype.
      Parameters:
      doctype - omit | auto | strict | loose | fpi, where the fpi is a string similar to "-//ACME//DTD HTML 3.14159//EN". Note: for fpi, include the double-quotes in the string.
      See Also:
    • getDocType

      public String getDocType()
      doctype- user specified doctype.
      Returns:
      omit | auto | strict | loose | fpi, where the fpi is a string similar to "-//ACME//DTD HTML 3.14159//EN". Note: for fpi, include the double-quotes in the string.
      See Also:
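The note about double quotes is easy to miss in Java source, where the quotes must be escaped inside the string literal. The sketch below shows both the keyword form and the fpi form. It assumes the JTidy jar is on the classpath; the class and helper names (`DocTypeExample`, `withDocType`) are hypothetical.

```java
import org.w3c.tidy.Tidy;

// Hypothetical helper class; only Tidy, setDocType and getDocType come from this page.
class DocTypeExample {

    /** Builds a Tidy instance whose doctype option is set to the given value. */
    static Tidy withDocType(String doctype) {
        Tidy tidy = new Tidy();
        tidy.setDocType(doctype);
        return tidy;
    }

    public static void main(String[] args) {
        // Keyword form: one of omit, auto, strict, loose.
        Tidy strict = withDocType("strict");

        // fpi form: the double quotes are part of the string value itself,
        // so they are escaped inside the Java literal.
        Tidy custom = withDocType("\"-//ACME//DTD HTML 3.14159//EN\"");

        System.out.println(strict.getDocType());
        System.out.println(custom.getDocType());
    }
}
```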
    • setLogicalEmphasis

      public void setLogicalEmphasis(boolean logicalEmphasis)
      logical-emphasis- replace i by em and b by strong.
      Parameters:
      logicalEmphasis - true if tidy should replace i by em and b by strong
      See Also:
    • getLogicalEmphasis

      public boolean getLogicalEmphasis()
      logical-emphasis- replace i by em and b by strong.
      Returns:
      true if tidy will replace i by em and b by strong
      See Also:
    • setXmlPIs

      public void setXmlPIs(boolean xmlPIs)
      assume-xml-procins This option specifies if Tidy should change the parsing of processing instructions to require ?> as the terminator rather than >. This option is automatically set if the input is in XML.
      Parameters:
      xmlPIs - true if tidy should expect a ?> at the end of processing instructions
      See Also:
    • getXmlPIs

      public boolean getXmlPIs()
      assume-xml-procins This option specifies if Tidy should change the parsing of processing instructions to require ?> as the terminator rather than >. This option is automatically set if the input is in XML.
      Returns:
      true if tidy will expect a ?> at the end of processing instructions
      See Also:
    • setEncloseText

      public void setEncloseText(boolean encloseText)
      enclose-text- if true text at body is wrapped in <p>'s.
      Parameters:
      encloseText - true if tidy should wrap text at body in <p>'s.
      See Also:
    • getEncloseText

      public boolean getEncloseText()
      enclose-text- if true text at body is wrapped in <p>'s.
      Returns:
      true if tidy will wrap text at body in <p>'s.
      See Also:
    • setEncloseBlockText

      public void setEncloseBlockText(boolean encloseBlockText)
      enclose-block-text- if true text in blocks is wrapped in <p>'s.
      Parameters:
      encloseBlockText - true if tidy should wrap text in blocks in <p>'s.
      See Also:
    • getEncloseBlockText

      public boolean getEncloseBlockText()
      enclose-block-text- if true text in blocks is wrapped in <p>'s.
      Returns:
      true if tidy will wrap text in blocks in <p>'s.
      See Also:
    • setWord2000

      public void setWord2000(boolean word2000)
      word-2000- draconian cleaning for Word2000.
      Parameters:
      word2000 - true if tidy should clean Word2000 documents
      See Also:
    • getWord2000

      public boolean getWord2000()
      word-2000- draconian cleaning for Word2000.
      Returns:
      true if tidy will clean Word2000 documents
      See Also:
    • setTidyMark

      public void setTidyMark(boolean tidyMark)
      tidy-mark- add meta element indicating tidied doc.
      Parameters:
      tidyMark - true if tidy should add meta element indicating tidied doc
      See Also:
    • getTidyMark

      public boolean getTidyMark()
      tidy-mark- add meta element indicating tidied doc.
      Returns:
      true if tidy will add meta element indicating tidied doc
      See Also:
    • setXmlSpace

      public void setXmlSpace(boolean xmlSpace)
      add-xml-space- if set to yes adds xml:space attr as needed.
      Parameters:
      xmlSpace - true if tidy should add xml:space attr as needed
      See Also:
    • getXmlSpace

      public boolean getXmlSpace()
      add-xml-space- if set to yes adds xml:space attr as needed.
      Returns:
      true if tidy will add xml:space attr as needed
      See Also:
    • setEmacs

      public void setEmacs(boolean emacs)
      gnu-emacs- if true format error output for GNU Emacs.
      Parameters:
      emacs - true if tidy should format error output for GNU Emacs
      See Also:
    • getEmacs

      public boolean getEmacs()
      gnu-emacs- if true format error output for GNU Emacs.
      Returns:
      true if tidy will format error output for GNU Emacs
      See Also:
    • setLiteralAttribs

      public void setLiteralAttribs(boolean literalAttribs)
      literal-attributes- if true attributes may use newlines.
      Parameters:
      literalAttribs - true if attributes may use newlines
      See Also:
    • getLiteralAttribs

      public boolean getLiteralAttribs()
      literal-attributes- if true attributes may use newlines.
      Returns:
      true if attributes may use newlines
      See Also:
    • setPrintBodyOnly

      public void setPrintBodyOnly(boolean bodyOnly)
      print-body-only- output BODY content only.
      Parameters:
      bodyOnly - true = print only the document body
      See Also:
    • getPrintBodyOnly

      public boolean getPrintBodyOnly()
      print-body-only- output BODY content only.
      Returns:
      true if tidy will print only the document body
    • setFixUri

      public void setFixUri(boolean fixUri)
      fix-uri- fix uri references applying URI encoding if necessary.
      Parameters:
      fixUri - true = fix uri references
      See Also:
    • getFixUri

      public boolean getFixUri()
      fix-uri- fix uri references applying URI encoding if necessary.
      Returns:
      true if tidy will fix uri references
    • setLowerLiterals

      public void setLowerLiterals(boolean lowerLiterals)
      lower-literals- folds known attribute values to lower case.
      Parameters:
      lowerLiterals - true = folds known attribute values to lower case
      See Also:
    • getLowerLiterals

      public boolean getLowerLiterals()
      lower-literals- folds known attribute values to lower case.
      Returns:
      true if tidy will fold known attribute values to lower case
    • setHideComments

      public void setHideComments(boolean hideComments)
      hide-comments- hides all (real) comments in output.
      Parameters:
      hideComments - true = hides all comments in output
      See Also:
    • getHideComments

      public boolean getHideComments()
      hide-comments- hides all (real) comments in output.
      Returns:
      true if tidy will hide all comments in output
    • setIndentCdata

      public void setIndentCdata(boolean indentCdata)
      indent-cdata- indent CDATA sections.
      Parameters:
      indentCdata - true = indent CDATA sections
      See Also:
    • getIndentCdata

      public boolean getIndentCdata()
      indent-cdata- indent CDATA sections.
      Returns:
      true if tidy will indent CDATA sections
    • setForceOutput

      public void setForceOutput(boolean forceOutput)
      force-output- output document even if errors were found.
      Parameters:
      forceOutput - true = output document even if errors were found
      See Also:
    • getForceOutput

      public boolean getForceOutput()
      force-output- output document even if errors were found.
      Returns:
      true if tidy will output document even if errors were found
    • setShowErrors

      public void setShowErrors(int showErrors)
      show-errors- set the number of errors to put out.
      Parameters:
      showErrors - number of errors to put out
      See Also:
    • getShowErrors

      public int getShowErrors()
      show-errors- number of errors to put out.
      Returns:
      the number of errors tidy will put out
    • setAsciiChars

      public void setAsciiChars(boolean asciiChars)
      ascii-chars- convert quotes and dashes to nearest ASCII char.
      Parameters:
      asciiChars - true = convert quotes and dashes to nearest ASCII char
      See Also:
    • getAsciiChars

      public boolean getAsciiChars()
      ascii-chars- convert quotes and dashes to nearest ASCII char.
      Returns:
      true if tidy will convert quotes and dashes to nearest ASCII char
    • setJoinClasses

      public void setJoinClasses(boolean joinClasses)
      join-classes- join multiple class attributes.
      Parameters:
      joinClasses - true = join multiple class attributes
      See Also:
    • getJoinClasses

      public boolean getJoinClasses()
      join-classes- join multiple class attributes.
      Returns:
      true if tidy will join multiple class attributes
    • setJoinStyles

      public void setJoinStyles(boolean joinStyles)
      join-styles- join multiple style attributes.
      Parameters:
      joinStyles - true = join multiple style attributes
      See Also:
    • getJoinStyles

      public boolean getJoinStyles()
      join-styles- join multiple style attributes.
      Returns:
      true if tidy will join multiple style attributes
    • setTrimEmptyElements

      public void setTrimEmptyElements(boolean trimEmpty)
      trim-empty-elements- trim empty elements.
      Parameters:
      trimEmpty - true = trim empty elements
      See Also:
    • getTrimEmptyElements

      public boolean getTrimEmptyElements()
      trim-empty-elements- trim empty elements.
      Returns:
      true if tidy will trim empty elements
    • setReplaceColor

      public void setReplaceColor(boolean replaceColor)
      replace-color- replace hex color attribute values with names.
      Parameters:
      replaceColor - true = replace hex color attribute values with names
      See Also:
    • getReplaceColor

      public boolean getReplaceColor()
      replace-color- replace hex color attribute values with names.
      Returns:
      true if tidy will replace hex color attribute values with names
    • setEscapeCdata

      public void setEscapeCdata(boolean escapeCdata)
      escape-cdata- replace CDATA sections with escaped text.
      Parameters:
      escapeCdata - true = replace CDATA sections with escaped text
      See Also:
    • getEscapeCdata

      public boolean getEscapeCdata()
      escape-cdata- replace CDATA sections with escaped text.
      Returns:
      true if tidy will replace CDATA sections with escaped text
    • setRepeatedAttributes

      public void setRepeatedAttributes(int repeatedAttributes)
      repeated-attributes- keep first or last duplicate attribute.
      Parameters:
      repeatedAttributes - Configuration.KEEP_FIRST | Configuration.KEEP_LAST
      See Also:
    • getRepeatedAttributes

      public int getRepeatedAttributes()
      repeated-attributes- keep first or last duplicate attribute.
      Returns:
      Configuration.KEEP_FIRST | Configuration.KEEP_LAST
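A minimal sketch of setting the duplicate-attribute options together, assuming the JTidy jar is on the classpath and that `Configuration` refers to `org.w3c.tidy.Configuration`. The class and method names (`RepeatedAttributesExample`, `keepFirstDuplicate`) are hypothetical.

```java
import org.w3c.tidy.Configuration;
import org.w3c.tidy.Tidy;

// Hypothetical helper class; the setters and constants are the ones documented above.
class RepeatedAttributesExample {

    /** Returns a Tidy instance that keeps the first of any duplicated attribute. */
    static Tidy keepFirstDuplicate() {
        Tidy tidy = new Tidy();
        tidy.setRepeatedAttributes(Configuration.KEEP_FIRST);
        tidy.setJoinClasses(true); // join multiple class attributes
        tidy.setJoinStyles(true);  // join multiple style attributes
        return tidy;
    }

    public static void main(String[] args) {
        Tidy tidy = keepFirstDuplicate();
        boolean keepsFirst = tidy.getRepeatedAttributes() == Configuration.KEEP_FIRST;
        System.out.println("keeps first duplicate: " + keepsFirst);
    }
}
```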
    • setKeepFileTimes

      public void setKeepFileTimes(boolean keepFileTimes)
      keep-time- if true last modified time is preserved.
      Parameters:
      keepFileTimes - true if tidy should preserve last modified time in input file.
      See Also:
      To do:
      this is NOT supported at this time.
    • getKeepFileTimes

      public boolean getKeepFileTimes()
      keep-time- if true last modified time is preserved.
      Returns:
      true if tidy will preserve last modified time in input file.
      See Also:
      To do:
      this is NOT supported at this time.
    • setRawOut

      public void setRawOut(boolean rawOut)
      output-raw- avoid mapping values > 127 to entities. This has the same effect as specifying a "raw" encoding in the original version of tidy.
      Parameters:
      rawOut - avoid mapping values > 127 to entities
      See Also:
    • getRawOut

      public boolean getRawOut()
      output-raw- avoid mapping values > 127 to entities.
      Returns:
      true if tidy will not map values > 127 to entities
      See Also:
    • setInputEncoding

      public void setInputEncoding(String encoding)
      input-encoding the character encoding used for input.
      Parameters:
      encoding - a valid java encoding name
    • getInputEncoding

      public String getInputEncoding()
      input-encoding the character encoding used for input.
      Returns:
      the java name of the encoding currently used for input
    • setOutputEncoding

      public void setOutputEncoding(String encoding)
      output-encoding the character encoding used for output.
      Parameters:
      encoding - a valid java encoding name
    • getOutputEncoding

      public String getOutputEncoding()
      output-encoding the character encoding used for output.
      Returns:
      the java name of the encoding currently used for output
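The encoding setters are typically combined with a few of the output options above before parsing. The following is a sketch under stated assumptions rather than a reference usage: it assumes the JTidy jar is on the classpath and that the `parse(InputStream, OutputStream)` overload of this class is available. The class and method names (`TidyFragmentExample`, `tidyBody`) are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

import org.w3c.tidy.Tidy;

// Hypothetical helper class showing one way to combine the setters documented above.
class TidyFragmentExample {

    /** Tidies an HTML snippet and returns only the contents of its body. */
    static String tidyBody(String html) {
        Tidy tidy = new Tidy();
        tidy.setInputEncoding("UTF-8");  // encoding used to decode the input bytes
        tidy.setOutputEncoding("UTF-8"); // encoding used to encode the output bytes
        tidy.setPrintBodyOnly(true);     // output BODY content only
        tidy.setTidyMark(false);         // do not add the "tidied" meta element
        tidy.setForceOutput(true);       // output document even if errors were found

        ByteArrayInputStream in =
            new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8));
        ByteArrayOutputStream out = new ByteArrayOutputStream();

        // Assumption: parse(InputStream, OutputStream) writes the tidied markup to 'out'.
        tidy.parse(in, out);
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(tidyBody("<p>Hello <b>world</b>"));
    }
}
```

Diagnostics go to the configured error output (see Errout), so the returned string contains only the tidied markup.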