Package org.w3c.tidy
Class Configuration
java.lang.Object
org.w3c.tidy.Configuration
- All Implemented Interfaces:
Serializable
Read configuration file and manage configuration properties. Configuration files associate a property name with a
value. The format is that of a Java .properties file.
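As a rough illustration of the file format described above, the following sketch parses a few name/value pairs with the standard java.util.Properties class. The option names and values (indent, wrap, tidy-mark) are assumptions chosen only to show the syntax; they are not taken from this page, and the class name PropertiesFormatSketch is hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class PropertiesFormatSketch {

    // Hypothetical option names and values, used only to show the syntax.
    // A .properties file accepts '=', ':' or whitespace between name and value.
    static final String SAMPLE =
        "# lines starting with '#' or '!' are comments\n"
        + "indent=yes\n"
        + "wrap: 72\n"
        + "tidy-mark no\n";

    // Parses .properties-formatted text into a Properties object.
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            // Not expected when reading from an in-memory string.
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = load(SAMPLE);
        for (String name : props.stringPropertyNames()) {
            System.out.println(name + " -> " + props.getProperty(name));
        }
    }
}
```

A Properties object built this way is the kind of input that addProps accepts, while parseFile takes a file name instead.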
- Version:
- $Revision: 817 $ ($Author: steffenyount $)
- Author:
- Dave Raggett dsr@w3.org , Andy Quick ac.quick@sympatico.ca (translation to Java), Fabrizio Giustina
Field Summary
Fields (Modifier and Type, Field: Description)
protected String altText: default text for alt attribute.
static final int ASCII: Deprecated.
protected boolean asciiChars: convert quotes and dashes to nearest ASCII char.
static final int BIG5: Deprecated.
protected boolean bodyOnly: output BODY content only.
protected boolean breakBeforeBR: output newline before br or not?
protected boolean burstSlides: create slides on each h2 element.
protected String cssPrefix: CSS class naming for -clean option.
protected int definedTags: track what types of tags user has defined to eliminate unnecessary searches.
static final int DOCTYPE_AUTO: treatment of doctype: auto.
static final int DOCTYPE_LOOSE: treatment of doctype: loose.
static final int DOCTYPE_OMIT: treatment of doctype: omit.
static final int DOCTYPE_STRICT: treatment of doctype: strict.
static final int DOCTYPE_USER: treatment of doctype: user.
protected int docTypeMode: see doctype property.
protected String docTypeStr: user specified doctype.
protected boolean dropEmptyParas: discard empty p elements.
protected boolean dropFontTags: discard presentation tags.
protected boolean dropProprietaryAttributes: discard proprietary attributes.
protected int duplicateAttrs: Keep first or last duplicate attribute.
protected boolean emacs: if true format error output for GNU Emacs.
protected boolean encloseBlockText: if yes text in blocks is wrapped in p's.
protected boolean encloseBodyText: if yes text at body is wrapped in p's.
protected String errfile: file name to write errors to.
protected boolean escapeCdata: replace CDATA sections with escaped text.
protected boolean fixBackslash: fix URLs by replacing \ with /.
protected boolean fixComments: fix comments with adjacent hyphens.
protected boolean fixUri: properly escape URLs.
protected boolean forceOutput: output document even if errors were found.
protected boolean hideComments: hides all (real) comments in output.
protected boolean hideEndTags: suppress optional end tags.
protected boolean htmlOut: output plain-old HTML, even for XHTML input.
protected boolean indentAttributes: newline+indent before each attribute.
protected boolean indentCdata: indent CDATA sections.
protected boolean indentContent: indent content of appropriate tags.
static final int ISO2022: Deprecated.
protected boolean joinClasses: join multiple class attributes.
protected boolean joinStyles: join multiple style attributes.
static final int KEEP_FIRST: Keep first duplicate attribute.
static final int KEEP_LAST: Keep last duplicate attribute.
protected boolean keepFileTimes: if yes last modified time is preserved.
protected String language: RJ language property.
static final int LATIN1: Deprecated.
protected boolean literalAttribs: if true attributes may use newlines.
protected boolean logicalEmphasis: replace i by em and b by strong.
protected boolean lowerLiterals: folds known attribute values to lower case.
static final int MACROMAN: Deprecated.
protected boolean makeBare: Make bare HTML: remove Microsoft cruft.
protected boolean makeClean: remove presentational clutter.
protected boolean ncr: allow numeric character references.
protected char[] newline: bytes for the newline marker.
protected boolean numEntities: use numeric entities.
protected boolean onlyErrors: if true normal output is suppressed.
protected boolean quiet: no 'Parsing X', guessed DTD or summary.
protected boolean quoteAmpersand: output naked ampersand as &amp;.
protected boolean quoteMarks: output " marks as &quot;.
protected boolean quoteNbsp: output non-breaking space as entity.
static final int RAW: Deprecated.
protected boolean rawOut: Avoid mapping values > 127 to entities.
protected boolean replaceColor: replace hex color attribute values with names.
protected String replacementCharEncoding: char encoding used when replacing illegal SGML chars, regardless of specified encoding.
protected Report report: Report instance.
static final int SHIFTJIS: Deprecated.
protected int showErrors: number of errors to put out.
protected boolean showWarnings: however errors are always shown.
protected String slidestyle: Deprecated. does nothing
protected boolean smartIndent: does text/block level content affect indentation.
protected int spaces: default indentation.
protected int tabsize: default tab size (8).
protected boolean tidyMark: add meta element indicating tidied doc.
protected boolean trimEmpty: trim empty elements.
protected TagTable tt: TagTable associated with this Configuration.
protected boolean upperCaseAttrs: output attributes in upper not lower case.
protected boolean upperCaseTags: output tags in upper not lower case.
static final int UTF16: Deprecated.
static final int UTF16BE: Deprecated.
static final int UTF16LE: Deprecated.
static final int UTF8: Deprecated.
static final int WIN1252: Deprecated.
protected boolean word2000: draconian cleaning for Word2000.
protected boolean wrapAsp: wrap within ASP pseudo elements.
protected boolean wrapAttVals: wrap within attribute values.
protected boolean wrapJste: wrap within JSTE pseudo elements.
protected int wraplen: default wrap margin (68).
protected boolean wrapPhp: wrap within PHP pseudo elements.
protected boolean wrapScriptlets: wrap within JavaScript string literals.
protected boolean wrapSection: wrap within CDATA section tags.
protected boolean writeback: if true then output tidied markup.
protected boolean xHTML: output extensible HTML.
protected boolean xmlOut: create output as XML.
protected boolean xmlPi: add <?xml?> for XML docs.
protected boolean xmlPIs: If set to yes PIs must end with ?>.
protected boolean xmlSpace: if set to yes adds xml:space attr as needed.
protected boolean xmlTags: treat input as XML.
Constructor Summary
Constructors (Modifier, Constructor: Description)
protected Configuration(Report report): Instantiates a new Configuration.
Method Summary
Methods (Modifier and Type, Method: Description)
void addProps(Properties p): adds configuration Properties.
void adjust(): Ensure that config is self consistent.
protected String convertCharEncoding(int code): Convert a char encoding from the deprecated tidy constant to a standard java encoding name.
protected String getInCharEncodingName(): Getter for inCharEncodingName.
protected String getOutCharEncodingName(): Getter for outCharEncodingName.
static boolean isKnownOption(String name): Is the given String a valid configuration flag?
void parseFile(String filename): Parses a property file.
void printConfigOptions(Writer errout, boolean showActualConfiguration): prints available configuration options.
protected void setInCharEncoding(int encoding): Deprecated. Use setInCharEncodingName(String).
protected void setInCharEncodingName(String encoding): Setter for inCharEncodingName.
protected void setInOutEncodingName(String encoding): Setter for inOutCharEncodingName.
protected void setOutCharEncoding(int encoding): Deprecated. Use setOutCharEncodingName(String).
protected void setOutCharEncodingName(String encoding): Setter for outCharEncodingName.
Field Details
RAW
public static final int RAW
Deprecated. Use Tidy.setRawOut(true) for raw output.
character encoding = RAW.

ASCII
public static final int ASCII
Deprecated.
character encoding = ASCII.

LATIN1
public static final int LATIN1
Deprecated.
character encoding = LATIN1.

UTF8
public static final int UTF8
Deprecated.
character encoding = UTF8.

ISO2022
public static final int ISO2022
Deprecated.
character encoding = ISO2022.

MACROMAN
public static final int MACROMAN
Deprecated.
character encoding = MACROMAN.

UTF16LE
public static final int UTF16LE
Deprecated.
character encoding = UTF16LE.

UTF16BE
public static final int UTF16BE
Deprecated.
character encoding = UTF16BE.

UTF16
public static final int UTF16
Deprecated.
character encoding = UTF16.

WIN1252
public static final int WIN1252
Deprecated.
character encoding = WIN1252.

BIG5
public static final int BIG5
Deprecated.
character encoding = BIG5.

SHIFTJIS
public static final int SHIFTJIS
Deprecated.
character encoding = SHIFTJIS.

DOCTYPE_OMIT
public static final int DOCTYPE_OMIT
treatment of doctype: omit.
- To do:
- should be an enumeration DocTypeMode

DOCTYPE_AUTO
public static final int DOCTYPE_AUTO
treatment of doctype: auto.

DOCTYPE_STRICT
public static final int DOCTYPE_STRICT
treatment of doctype: strict.

DOCTYPE_LOOSE
public static final int DOCTYPE_LOOSE
treatment of doctype: loose.

DOCTYPE_USER
public static final int DOCTYPE_USER
treatment of doctype: user.

KEEP_LAST
public static final int KEEP_LAST
Keep last duplicate attribute.
- To do:
- should be an enumeration DupAttrMode

KEEP_FIRST
public static final int KEEP_FIRST
Keep first duplicate attribute.

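The two to-do notes above suggest replacing the int constants with enumerations. The sketch below shows one way that might look; it is not part of the library. The enum names DocTypeMode and DupAttrMode come from the to-do notes, while the wrapper class ModeEnumsSketch, the parseDocTypeMode helper, and the choice of constant names are assumptions made for illustration.

```java
import java.util.Locale;

public class ModeEnumsSketch {

    /** Hypothetical enum mirroring the DOCTYPE_* int constants. */
    public enum DocTypeMode { OMIT, AUTO, STRICT, LOOSE, USER }

    /** Hypothetical enum mirroring KEEP_LAST and KEEP_FIRST. */
    public enum DupAttrMode { KEEP_LAST, KEEP_FIRST }

    /**
     * Hypothetical helper: maps a textual setting such as "strict" to a mode.
     * Throws IllegalArgumentException for unrecognized values.
     */
    public static DocTypeMode parseDocTypeMode(String value) {
        return DocTypeMode.valueOf(value.trim().toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        DocTypeMode mode = parseDocTypeMode("strict");
        DupAttrMode dup = DupAttrMode.KEEP_LAST;
        // A switch over an enum lets the compiler and IDE flag missing cases,
        // which plain int constants cannot do.
        switch (mode) {
            case OMIT:
                System.out.println("no doctype is written");
                break;
            default:
                System.out.println("doctype mode: " + mode + ", duplicates: " + dup);
        }
    }
}
```

Compared with int constants, an enum prevents callers from passing an arbitrary integer where a mode is expected.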
spaces
protected int spaces
default indentation.

wraplen
protected int wraplen
default wrap margin (68).

tabsize
protected int tabsize
default tab size (8).

docTypeMode
protected int docTypeMode
see doctype property.

duplicateAttrs
protected int duplicateAttrs
Keep first or last duplicate attribute.

altText
protected String altText
default text for alt attribute.

slidestyle
protected String slidestyle
Deprecated. does nothing
style sheet for slides.

language
protected String language
RJ language property.

docTypeStr
protected String docTypeStr
user specified doctype.

errfile
protected String errfile
file name to write errors to.

writeback
protected boolean writeback
if true then output tidied markup.

onlyErrors
protected boolean onlyErrors
if true normal output is suppressed.

showWarnings
protected boolean showWarnings
however errors are always shown.

quiet
protected boolean quiet
no 'Parsing X', guessed DTD or summary.

indentContent
protected boolean indentContent
indent content of appropriate tags.

smartIndent
protected boolean smartIndent
does text/block level content affect indentation.

hideEndTags
protected boolean hideEndTags
suppress optional end tags.

xmlTags
protected boolean xmlTags
treat input as XML.

xmlOut
protected boolean xmlOut
create output as XML.

xHTML
protected boolean xHTML
output extensible HTML.

htmlOut
protected boolean htmlOut
output plain-old HTML, even for XHTML input. Yes means set explicitly.

xmlPi
protected boolean xmlPi
add <?xml?> for XML docs.

upperCaseTags
protected boolean upperCaseTags
output tags in upper not lower case.

upperCaseAttrs
protected boolean upperCaseAttrs
output attributes in upper not lower case.

makeClean
protected boolean makeClean
remove presentational clutter.

makeBare
protected boolean makeBare
Make bare HTML: remove Microsoft cruft.

logicalEmphasis
protected boolean logicalEmphasis
replace i by em and b by strong.

dropFontTags
protected boolean dropFontTags
discard presentation tags.

dropProprietaryAttributes
protected boolean dropProprietaryAttributes
discard proprietary attributes.

dropEmptyParas
protected boolean dropEmptyParas
discard empty p elements.

fixComments
protected boolean fixComments
fix comments with adjacent hyphens.

trimEmpty
protected boolean trimEmpty
trim empty elements.

breakBeforeBR
protected boolean breakBeforeBR
output newline before br or not?

burstSlides
protected boolean burstSlides
create slides on each h2 element.

numEntities
protected boolean numEntities
use numeric entities.

quoteMarks
protected boolean quoteMarks
output " marks as &quot;.

quoteNbsp
protected boolean quoteNbsp
output non-breaking space as entity.

quoteAmpersand
protected boolean quoteAmpersand
output naked ampersand as &amp;.

wrapAttVals
protected boolean wrapAttVals
wrap within attribute values.

wrapScriptlets
protected boolean wrapScriptlets
wrap within JavaScript string literals.

wrapSection
protected boolean wrapSection
wrap within CDATA section tags.

wrapAsp
protected boolean wrapAsp
wrap within ASP pseudo elements.

wrapJste
protected boolean wrapJste
wrap within JSTE pseudo elements.

wrapPhp
protected boolean wrapPhp
wrap within PHP pseudo elements.

fixBackslash
protected boolean fixBackslash
fix URLs by replacing \ with /.

indentAttributes
protected boolean indentAttributes
newline+indent before each attribute.

xmlPIs
protected boolean xmlPIs
If set to yes PIs must end with ?>.

xmlSpace
protected boolean xmlSpace
if set to yes adds xml:space attr as needed.

encloseBodyText
protected boolean encloseBodyText
if yes text at body is wrapped in p's.

encloseBlockText
protected boolean encloseBlockText
if yes text in blocks is wrapped in p's.

keepFileTimes
protected boolean keepFileTimes
if yes last modified time is preserved.

word2000
protected boolean word2000
draconian cleaning for Word2000.

tidyMark
protected boolean tidyMark
add meta element indicating tidied doc.

emacs
protected boolean emacs
if true format error output for GNU Emacs.

literalAttribs
protected boolean literalAttribs
if true attributes may use newlines.

bodyOnly
protected boolean bodyOnly
output BODY content only.

fixUri
protected boolean fixUri
properly escape URLs.

lowerLiterals
protected boolean lowerLiterals
folds known attribute values to lower case.

replaceColor
protected boolean replaceColor
replace hex color attribute values with names.

hideComments
protected boolean hideComments
hides all (real) comments in output.

indentCdata
protected boolean indentCdata
indent CDATA sections.

forceOutput
protected boolean forceOutput
output document even if errors were found.

showErrors
protected int showErrors
number of errors to put out.

asciiChars
protected boolean asciiChars
convert quotes and dashes to nearest ASCII char.

joinClasses
protected boolean joinClasses
join multiple class attributes.

joinStyles
protected boolean joinStyles
join multiple style attributes.

escapeCdata
protected boolean escapeCdata
replace CDATA sections with escaped text.

ncr
protected boolean ncr
allow numeric character references.

cssPrefix
protected String cssPrefix
CSS class naming for -clean option.

replacementCharEncoding
protected String replacementCharEncoding
char encoding used when replacing illegal SGML chars, regardless of specified encoding.

tt
protected TagTable tt
TagTable associated with this Configuration.

report
protected Report report
Report instance. Used for messages.

definedTags
protected int definedTags
track what types of tags user has defined to eliminate unnecessary searches.

newline
protected char[] newline
bytes for the newline marker.

rawOut
protected boolean rawOut
Avoid mapping values > 127 to entities.

Constructor Details
Configuration
protected Configuration(Report report)
Instantiates a new Configuration. This method should be called by Tidy only.
- Parameters:
- report - Report instance

Method Details
addProps
public void addProps(Properties p)
adds configuration Properties.
- Parameters:
- p - Properties

parseFile
public void parseFile(String filename)
Parses a property file.
- Parameters:
- filename - file name

isKnownOption
public static boolean isKnownOption(String name)
Is the given String a valid configuration flag?
- Parameters:
- name - configuration parameter name
- Returns:
- true if the given String is a valid config option

adjust
public void adjust()
Ensure that config is self consistent.

printConfigOptions
public void printConfigOptions(Writer errout, boolean showActualConfiguration)
prints available configuration options.
- Parameters:
- errout - where to write
- showActualConfiguration - print actual configuration values

getInCharEncodingName
protected String getInCharEncodingName()
Getter for inCharEncodingName.
- Returns:
- Returns the inCharEncodingName.

setInCharEncodingName
protected void setInCharEncodingName(String encoding)
Setter for inCharEncodingName.
- Parameters:
- encoding - The inCharEncodingName to set.

getOutCharEncodingName
protected String getOutCharEncodingName()
Getter for outCharEncodingName.
- Returns:
- Returns the outCharEncodingName.

setOutCharEncodingName
protected void setOutCharEncodingName(String encoding)
Setter for outCharEncodingName.
- Parameters:
- encoding - The outCharEncodingName to set.

setInOutEncodingName
protected void setInOutEncodingName(String encoding)
Setter for inOutCharEncodingName.
- Parameters:
- encoding - The CharEncodingName to set.

setOutCharEncoding
protected void setOutCharEncoding(int encoding)
Deprecated. Use setOutCharEncodingName(String).
Setter for outCharEncoding.
- Parameters:
- encoding - The outCharEncoding to set.

setInCharEncoding
protected void setInCharEncoding(int encoding)
Deprecated. Use setInCharEncodingName(String).
Setter for inCharEncoding.
- Parameters:
- encoding - The inCharEncoding to set.

convertCharEncoding
protected String convertCharEncoding(int code)
Convert a char encoding from the deprecated tidy constant to a standard java encoding name.
- Parameters:
- code - encoding code
- Returns:
- encoding name
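
To make the idea behind convertCharEncoding concrete, here is a minimal sketch of mapping a legacy integer code to a Java charset name. It is not the library's implementation. The numeric values below are placeholders, because this page does not show the actual values of the deprecated constants, and the class EncodingCodeSketch and method toJavaName are hypothetical. Only encodings that every Java platform is required to support are covered.

```java
import java.nio.charset.StandardCharsets;

public class EncodingCodeSketch {

    // Placeholder codes for illustration only; the real constant values
    // of the deprecated fields are not listed on this page.
    public static final int ASCII = 1;
    public static final int LATIN1 = 2;
    public static final int UTF8 = 3;
    public static final int UTF16LE = 6;
    public static final int UTF16BE = 7;
    public static final int UTF16 = 8;

    /** Maps a legacy integer code to a canonical Java charset name. */
    public static String toJavaName(int code) {
        switch (code) {
            case ASCII:   return StandardCharsets.US_ASCII.name();
            case LATIN1:  return StandardCharsets.ISO_8859_1.name();
            case UTF8:    return StandardCharsets.UTF_8.name();
            case UTF16LE: return StandardCharsets.UTF_16LE.name();
            case UTF16BE: return StandardCharsets.UTF_16BE.name();
            case UTF16:   return StandardCharsets.UTF_16.name();
            default:
                throw new IllegalArgumentException("unknown encoding code: " + code);
        }
    }

    public static void main(String[] args) {
        System.out.println(toJavaName(UTF8));
    }
}
```

New code would skip the integer codes entirely and pass an encoding name to setInCharEncodingName or setOutCharEncodingName, as the deprecation notes recommend.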