harvest is a robot that downloads HTML web pages

View source online: sample.cfg

Software size: 7910 K
Uploaded by: pc1667pc1667
Keywords: harvest html

Related code

# Sample configuration file for WP2X for a mythical conversion job.
#
# This is a configuration file that might be used to convert a
# multi-chapter paper from WordPerfect into LaTeX.
#
# One way to approach writing a configuration file is to first use a null.cfg
# file that defines nothing, then run your WP file through it, making careful
# note of the warning messages.  Then study how those tokens are used in the
# WP file and write a configuration file based thereon.
#
# We will use article style.  See the description of HPg for why there is
# a =section at the end of the BEGIN string.

BEGIN="\\documentstyle{article}\n=section"
END="\\end{document}\n"

# How to make a one-line comment in LaTeX.
Comment="%\n%% "
comment="\n"

# Protect percent signs and other magic symbols.
'%'="\\%%"
'$'="\\$"
'#'="\\#"
'&'="\\&"
'^'="\\^{}"
'_'="\\_"
'~'="\\~{}"

# Now the actual code expansions.  These are taken straight from latex.cfg

HSpace="~"                   # Unbreakable space
HRt="%\n\n"                  # Hard return becomes a blank line
SRt="%\n"                    # Soft return is a newline
-="-"                        # Hyphens are hyphens
--="-%%\n"                   # Hyphen at the end of a line gets a %
=="{-}"                      # Nonbreaking hyphen
\-="\\-"                     # Discretionary hyphen
\--="\\-%%\n"                # Discretionary hyphen at the end of the line

Und="{\\em "                 # Underlining is in "emphasized"
und="\\/}"                   # with italic correction stuck in always.

# As part of the postprocessing, I'd probably want to remove spurious italic
# corrections, via
#
#    %s/\\\/},/},/g
#    %s/\\\/}\./}./g
#
# The only time boldface is used is in the section headings, so I can make
# the codes expand to nothing, since LaTeX boldfaces the section headings
# automatically.
Bold=""
bold=""

# Since my paper never uses the equal-sign `=', I can use it as a special
# tag in the output file.
#
# The style used in this paper is that all section and subsection headings
# appear as centered lines, and centered lines are used nowhere else in
# the paper.  Sections start after a hard page break; subsections continue
# in the middle of a page.  So what we'll do is leave a tag here, then
# postprocess the output with some vi macros.

HPg="%\n\n=section"
Center="=center{"
center="}\n"

# Afterwards, the following vi commands will turn the centering commands
# into the proper \section or \subsection commands.
#
#   %s/^=section=center/\\section/
#   %s/^=center/\\subsection/
#

# The only time single spacing is needed is during quotations, so we'll
# use SS and DS as signals to enter and exit the quotation environment.
# This produces a spurious \end{quote} at the top of the document, which
# we'll delete as part of the postprocessing.  Consequently, I don't
# need DIndent, since the {quote} environment does that for me.

SS="%\n\\begin{quote}\n"
DS="%\n\\end{quote}\n"
DIndent=""
indent=""

# The only characters I use overlap to produce is a capital O with a slash
# through it.  So afterwards, a simple
#
#   %s/O=
#
# will turn all `O overprint /' into `{\O}'.
#
Over="=
over=""

# Though I do need some other accented characters for words like
# r\'egime, r\^ole, na\"{\i}ve, and co\"operation.
'\202'="\\'e"
'\223'="\\^o"
'\213'="\\\"{\\i}"
'\224'="\\\"o"

Fn="%\n\\begin{footnote}\n"     # begin footnote
fn="%\n\\end{footnote}\n"
FNote#=""                    # Note numbers are automatically generated.
ENote#=""                    # Note numbers are automatically generated.

# I'll ignore headers and footers and suppression, since LaTeX does
# that automatically.
Header="%\n%% header\n\\toks0={"
header="} %% delete these lines!\n"
Footer="%\n%% footer\n\\toks0={"
footer="} %% delete these lines!\n"
Supp=""
PN0=""
PN1=""
PN2=""
PN3=""
PN4=""
PN5=""
PN6=""
PN7=""
PN8=""
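
The comments in the configuration describe a postprocessing pass done by hand with vi substitutions. For readers who would rather script that pass, the following is a minimal Python sketch of the same substitutions. It is not part of WP2X; the function name `postprocess` and the sample text are assumptions made for illustration. It covers only the heading tags and the italic-correction cleanup, not the deletion of the spurious \end{quote} or the overprint rewrite.

```python
import re


def postprocess(text: str) -> str:
    """Apply the vi-style substitutions described in the sample.cfg comments.

    Hypothetical helper: the name and interface are illustrative only.
    """
    # "=section=center{...}" at the start of a line becomes "\section{...}"
    text = re.sub(r"^=section=center", r"\\section", text, flags=re.MULTILINE)
    # Any remaining "=center{...}" at the start of a line becomes "\subsection{...}"
    text = re.sub(r"^=center", r"\\subsection", text, flags=re.MULTILINE)
    # Drop the italic correction "\/" when the closing brace is followed
    # directly by a comma or a period
    text = re.sub(r"\\/\}([,.])", r"}\1", text)
    return text


if __name__ == "__main__":
    # Made-up fragment of converter output, for demonstration only
    sample = (
        "=section=center{Introduction}\n"
        "=center{Scope}\n"
        "An {\\em example\\/}, here.\n"
    )
    print(postprocess(sample))
```

The order of the two heading substitutions matters: the `=section=center` rule has to run first, because the `=center` rule is anchored at the start of the line and would otherwise never see the combined tag as a section heading.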
