/contrib/sdk/sources/expat/COPYING |
---|
0,0 → 1,22 |
Copyright (c) 1998, 1999, 2000 Thai Open Source Software Center Ltd |
and Clark Cooper |
Copyright (c) 2001, 2002, 2003, 2004, 2005, 2006 Expat maintainers. |
Permission is hereby granted, free of charge, to any person obtaining |
a copy of this software and associated documentation files (the |
"Software"), to deal in the Software without restriction, including |
without limitation the rights to use, copy, modify, merge, publish, |
distribute, sublicense, and/or sell copies of the Software, and to |
permit persons to whom the Software is furnished to do so, subject to |
the following conditions: |
The above copyright notice and this permission notice shall be included |
in all copies or substantial portions of the Software. |
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, |
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF |
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. |
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY |
CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, |
TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE |
SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. |
/contrib/sdk/sources/expat/Changes |
---|
0,0 → 1,205 |
Release 2.1.0 Sat March 24 2012 |
- Bug Fixes: |
#1742315: Harmful XML_ParserCreateNS suggestion. |
#2895533: CVE-2012-1147 - Resource leak in readfilemap.c. |
#1785430: Expat build fails on linux-amd64 with gcc version>=4.1 -O3. |
#1983953, 2517952, 2517962, 2649838: |
Build modifications using autoreconf instead of buildconf.sh. |
#2815947, #2884086: OBJEXT and EXEEXT support while building. |
#1990430: CVE-2009-3720 - Parser crash with special UTF-8 sequences. |
#2517938: xmlwf should return non-zero exit status if not well-formed. |
#2517946: Wrong statement about XMLDecl in xmlwf.1 and xmlwf.sgml. |
#2855609: Dangling positionPtr after error. |
#2894085: CVE-2009-3560 - Buffer over-read and crash in big2_toUtf8(). |
#2958794: CVE-2012-1148 - Memory leak in poolGrow. |
#2990652: CMake support. |
#3010819: UNEXPECTED_STATE with a trailing "%" in entity value. |
#3206497: Unitialized memory returned from XML_Parse. |
#3287849: make check fails on mingw-w64. |
#3496608: CVE-2012-0876 - Hash DOS attack. |
- Patches: |
#1749198: pkg-config support. |
#3010222: Fix for bug #3010819. |
#3312568: CMake support. |
#3446384: Report byte offsets for attr names and values. |
- New Features / API changes: |
Added new API member XML_SetHashSalt() that allows setting an intial |
value (salt) for hash calculations. This is part of the fix for |
bug #3496608 to randomize hash parameters. |
When compiled with XML_ATTR_INFO defined, adds new API member |
XML_GetAttributeInfo() that allows retrieving the byte |
offsets for attribute names and values (patch #3446384). |
Added CMake build system. |
See bug #2990652 and patch #3312568. |
Added run-benchmark target to Makefile.in - relies on testdata module |
present in the same relative location as in the repository. |
Release 2.0.1 Tue June 5 2007 |
- Fixed bugs #1515266, #1515600: The character data handler's calling |
of XML_StopParser() was not handled properly; if the parser was |
stopped and the handler set to NULL, the parser would segfault. |
- Fixed bug #1690883: Expat failed on EBCDIC systems as it assumed |
some character constants to be ASCII encoded. |
- Minor cleanups of the test harness. |
- Fixed xmlwf bug #1513566: "out of memory" error on file size zero. |
- Fixed outline.c bug #1543233: missing a final XML_ParserFree() call. |
- Fixes and improvements for Windows platform: |
bugs #1409451, #1476160, #1548182, #1602769, #1717322. |
- Build fixes for various platforms: |
HP-UX, Tru64, Solaris 9: patch #1437840, bug #1196180. |
All Unix: #1554618 (refreshed config.sub/config.guess). |
#1490371, #1613457: support both, DESTDIR and INSTALL_ROOT, |
without relying on GNU-Make specific features. |
#1647805: Patched configure.in to work better with Intel compiler. |
- Fixes to Makefile.in to have make check work correctly: |
bugs #1408143, #1535603, #1536684. |
- Added Open Watcom support: patch #1523242. |
Release 2.0.0 Wed Jan 11 2006 |
- We no longer use the "check" library for C unit testing; we |
always use the (partial) internal implementation of the API. |
- Report XML_NS setting via XML_GetFeatureList(). |
- Fixed headers for use from C++. |
- XML_GetCurrentLineNumber() and XML_GetCurrentColumnNumber() |
now return unsigned integers. |
- Added XML_LARGE_SIZE switch to enable 64-bit integers for |
byte indexes and line/column numbers. |
- Updated to use libtool 1.5.22 (the most recent). |
- Added support for AmigaOS. |
- Some mostly minor bug fixes. SF issues include: #1006708, |
#1021776, #1023646, #1114960, #1156398, #1221160, #1271642. |
Release 1.95.8 Fri Jul 23 2004 |
- Major new feature: suspend/resume. Handlers can now request |
that a parse be suspended for later resumption or aborted |
altogether. See "Temporarily Stopping Parsing" in the |
documentation for more details. |
- Some mostly minor bug fixes, but compilation should no |
longer generate warnings on most platforms. SF issues |
include: #827319, #840173, #846309, #888329, #896188, #923913, |
#928113, #961698, #985192. |
Release 1.95.7 Mon Oct 20 2003 |
- Fixed enum XML_Status issue (reported on SourceForge many |
times), so compilers that are properly picky will be happy. |
- Introduced an XMLCALL macro to control the calling |
convention used by the Expat API; this macro should be used |
to annotate prototypes and definitions of callback |
implementations in code compiled with a calling convention |
other than the default convention for the host platform. |
- Improved ability to build without the configure-generated |
expat_config.h header. This is useful for applications |
which embed Expat rather than linking in the library. |
- Fixed a variety of bugs: see SF issues #458907, #609603, |
#676844, #679754, #692878, #692964, #695401, #699323, #699487, |
#820946. |
- Improved hash table lookups. |
- Added more regression tests and improved documentation. |
Release 1.95.6 Tue Jan 28 2003 |
- Added XML_FreeContentModel(). |
- Added XML_MemMalloc(), XML_MemRealloc(), XML_MemFree(). |
- Fixed a variety of bugs: see SF issues #615606, #616863, |
#618199, #653180, #673791. |
- Enhanced the regression test suite. |
- Man page improvements: includes SF issue #632146. |
Release 1.95.5 Fri Sep 6 2002 |
- Added XML_UseForeignDTD() for improved SAX2 support. |
- Added XML_GetFeatureList(). |
- Defined XML_Bool type and the values XML_TRUE and XML_FALSE. |
- Use an incomplete struct instead of a void* for the parser |
(may not retain). |
- Fixed UTF-8 decoding bug that caused legal UTF-8 to be rejected. |
- Finally fixed bug where default handler would report DTD |
events that were already handled by another handler. |
Initial patch contributed by Darryl Miles. |
- Removed unnecessary DllMain() function that caused static |
linking into a DLL to be difficult. |
- Added VC++ projects for building static libraries. |
- Reduced line-length for all source code and headers to be |
no longer than 80 characters, to help with AS/400 support. |
- Reduced memory copying during parsing (SF patch #600964). |
- Fixed a variety of bugs: see SF issues #580793, #434664, |
#483514, #580503, #581069, #584041, #584183, #584832, #585537, |
#596555, #596678, #598352, #598944, #599715, #600479, #600971. |
Release 1.95.4 Fri Jul 12 2002 |
- Added support for VMS, contributed by Craig Berry. See |
vms/README.vms for more information. |
- Added Mac OS (classic) support, with a makefile for MPW, |
contributed by Thomas Wegner and Daryle Walker. |
- Added Borland C++ Builder 5 / BCC 5.5 support, contributed |
by Patrick McConnell (SF patch #538032). |
- Fixed a variety of bugs: see SF issues #441449, #563184, |
#564342, #566334, #566901, #569461, #570263, #575168, #579196. |
- Made skippedEntityHandler conform to SAX2 (see source comment) |
- Re-implemented WFC: Entity Declared from XML 1.0 spec and |
added a new error "entity declared in parameter entity": |
see SF bug report #569461 and SF patch #578161 |
- Re-implemented section 5.1 from XML 1.0 spec: |
see SF bug report #570263 and SF patch #578161 |
Release 1.95.3 Mon Jun 3 2002 |
- Added a project to the MSVC workspace to create a wchar_t |
version of the library; the DLLs are named libexpatw.dll. |
- Changed the name of the Windows DLLs from expat.dll to |
libexpat.dll; this fixes SF bug #432456. |
- Added the XML_ParserReset() API function. |
- Fixed XML_SetReturnNSTriplet() to work for element names. |
- Made the XML_UNICODE builds usable (thanks, Karl!). |
- Allow xmlwf to read from standard input. |
- Install a man page for xmlwf on Unix systems. |
- Fixed many bugs; see SF bug reports #231864, #461380, #464837, |
#466885, #469226, #477667, #484419, #487840, #494749, #496505, |
#547350. Other bugs which we can't test as easily may also |
have been fixed, especially in the area of build support. |
Release 1.95.2 Fri Jul 27 2001 |
- More changes to make MSVC happy with the build; add a single |
workspace to support both the library and xmlwf application. |
- Added a Windows installer for Windows users; includes |
xmlwf.exe. |
- Added compile-time constants that can be used to determine the |
Expat version |
- Removed a lot of GNU-specific dependencies to aide portability |
among the various Unix flavors. |
- Fix the UTF-8 BOM bug. |
- Cleaned up warning messages for several compilers. |
- Added the -Wall, -Wstrict-prototypes options for GCC. |
Release 1.95.1 Sun Oct 22 15:11:36 EDT 2000 |
- Changes to get expat to build under Microsoft compiler |
- Removed all aborts and instead return an UNEXPECTED_STATE error. |
- Fixed a bug where a stray '%' in an entity value would cause an |
abort. |
- Defined XML_SetEndNamespaceDeclHandler. Thanks to Darryl Miles for |
finding this oversight. |
- Changed default patterns in lib/Makefile.in to fit non-GNU makes |
Thanks to robin@unrated.net for reporting and providing an |
account to test on. |
- The reference had the wrong label for XML_SetStartNamespaceDecl. |
Reported by an anonymous user. |
Release 1.95.0 Fri Sep 29 2000 |
- XML_ParserCreate_MM |
Allows you to set a memory management suite to replace the |
standard malloc,realloc, and free. |
- XML_SetReturnNSTriplet |
If you turn this feature on when namespace processing is in |
effect, then qualified, prefixed element and attribute names |
are returned as "uri|name|prefix" where '|' is whatever |
separator character is used in namespace processing. |
- Merged in features from perl-expat |
o XML_SetElementDeclHandler |
o XML_SetAttlistDeclHandler |
o XML_SetXmlDeclHandler |
o XML_SetEntityDeclHandler |
o StartDoctypeDeclHandler takes 3 additional parameters: |
sysid, pubid, has_internal_subset |
o Many paired handler setters (like XML_SetElementHandler) |
now have corresponding individual handler setters |
o XML_GetInputContext for getting the input context of |
the current parse position. |
- Added reference material |
- Packaged into a distribution that builds a sharable library |
/contrib/sdk/sources/expat/Makefile |
---|
0,0 → 1,40 |
CC=gcc |
LD= ld |
AR= ar |
LIBRARY= libexpat |
CFLAGS = -U_Win32 -U_WIN32 -U__MINGW32__ -c -O2 -fomit-frame-pointer |
INCLUDES= -I. -I../newlib/include |
DEFS = -DHAVE_EXPAT_CONFIG_H |
DEFINES= $(DEFS) |
SRCS = lib/xmlparse.c \ |
lib/xmlrole.c \ |
lib/xmltok.c \ |
lib/xmltok_impl.c \ |
lib/xmltok_ns.c \ |
$(NULL) |
OBJS = $(patsubst %.c, %.o, $(SRCS)) |
# targets |
all:$(LIBRARY).a |
$(LIBRARY).a: $(OBJS) Makefile |
ar cvrs $(LIBRARY).a $(OBJS) |
mv -f $(LIBRARY).a ../../lib |
%.o : %.c Makefile |
$(CC) $(CFLAGS) $(DEFINES) $(INCLUDES) -o $@ $< |
clean: |
-rm -f lib/*.o |
/contrib/sdk/sources/expat/README |
---|
0,0 → 1,139 |
Expat, Release 2.1.0 |
This is Expat, a C library for parsing XML, written by James Clark. |
Expat is a stream-oriented XML parser. This means that you register |
handlers with the parser before starting the parse. These handlers |
are called when the parser discovers the associated structures in the |
document being parsed. A start tag is an example of the kind of |
structures for which you may register handlers. |
Windows users should use the expat_win32bin package, which includes |
both precompiled libraries and executables, and source code for |
developers. |
Expat is free software. You may copy, distribute, and modify it under |
the terms of the License contained in the file COPYING distributed |
with this package. This license is the same as the MIT/X Consortium |
license. |
Versions of Expat that have an odd minor version (the middle number in |
the release above), are development releases and should be considered |
as beta software. Releases with even minor version numbers are |
intended to be production grade software. |
If you are building Expat from a check-out from the CVS repository, |
you need to run a script that generates the configure script using the |
GNU autoconf and libtool tools. To do this, you need to have |
autoconf 2.58 or newer. Run the script like this: |
./buildconf.sh |
Once this has been done, follow the same instructions as for building |
from a source distribution. |
To build Expat from a source distribution, you first run the |
configuration shell script in the top level distribution directory: |
./configure |
There are many options which you may provide to configure (which you |
can discover by running configure with the --help option). But the |
one of most interest is the one that sets the installation directory. |
By default, the configure script will set things up to install |
libexpat into /usr/local/lib, expat.h into /usr/local/include, and |
xmlwf into /usr/local/bin. If, for example, you'd prefer to install |
into /home/me/mystuff/lib, /home/me/mystuff/include, and |
/home/me/mystuff/bin, you can tell configure about that with: |
./configure --prefix=/home/me/mystuff |
Another interesting option is to enable 64-bit integer support for |
line and column numbers and the over-all byte index: |
./configure CPPFLAGS=-DXML_LARGE_SIZE |
However, such a modification would be a breaking change to the ABI |
and is therefore not recommended for general use - e.g. as part of |
a Linux distribution - but rather for builds with special requirements. |
After running the configure script, the "make" command will build |
things and "make install" will install things into their proper |
location. Have a look at the "Makefile" to learn about additional |
"make" options. Note that you need to have write permission into |
the directories into which things will be installed. |
If you are interested in building Expat to provide document |
information in UTF-16 encoding rather than the default UTF-8, follow |
these instructions (after having run "make distclean"): |
1. For UTF-16 output as unsigned short (and version/error |
strings as char), run: |
./configure CPPFLAGS=-DXML_UNICODE |
For UTF-16 output as wchar_t (incl. version/error strings), |
run: |
./configure CFLAGS="-g -O2 -fshort-wchar" \ |
CPPFLAGS=-DXML_UNICODE_WCHAR_T |
2. Edit the MakeFile, changing: |
LIBRARY = libexpat.la |
to: |
LIBRARY = libexpatw.la |
(Note the additional "w" in the library name.) |
3. Run "make buildlib" (which builds the library only). |
Or, to save step 2, run "make buildlib LIBRARY=libexpatw.la". |
4. Run "make installlib" (which installs the library only). |
Or, if step 2 was omitted, run "make installlib LIBRARY=libexpatw.la". |
Using DESTDIR or INSTALL_ROOT is enabled, with INSTALL_ROOT being the default |
value for DESTDIR, and the rest of the make file using only DESTDIR. |
It works as follows: |
$ make install DESTDIR=/path/to/image |
overrides the in-makefile set DESTDIR, while both |
$ INSTALL_ROOT=/path/to/image make install |
$ make install INSTALL_ROOT=/path/to/image |
use DESTDIR=$(INSTALL_ROOT), even if DESTDIR eventually is defined in the |
environment, because variable-setting priority is |
1) commandline |
2) in-makefile |
3) environment |
Note: This only applies to the Expat library itself, building UTF-16 versions |
of xmlwf and the tests is currently not supported. |
Note for Solaris users: The "ar" command is usually located in |
"/usr/ccs/bin", which is not in the default PATH. You will need to |
add this to your path for the "make" command, and probably also switch |
to GNU make (the "make" found in /usr/ccs/bin does not seem to work |
properly -- appearantly it does not understand .PHONY directives). If |
you're using ksh or bash, use this command to build: |
PATH=/usr/ccs/bin:$PATH make |
When using Expat with a project using autoconf for configuration, you |
can use the probing macro in conftools/expat.m4 to determine how to |
include Expat. See the comments at the top of that file for more |
information. |
A reference manual is available in the file doc/reference.html in this |
distribution. |
The homepage for this project is http://www.libexpat.org/. There |
are links there to connect you to the bug reports page. If you need |
to report a bug when you don't have access to a browser, you may also |
send a bug report by email to expat-bugs@mail.libexpat.org. |
Discussion related to the direction of future expat development takes |
place on expat-discuss@mail.libexpat.org. Archives of this list and |
other Expat-related lists may be found at: |
http://mail.libexpat.org/mailman/listinfo/ |
/contrib/sdk/sources/expat/doc/expat.png |
---|
Cannot display: file marked as a binary type. |
svn:mime-type = application/octet-stream |
Property changes: |
Added: svn:mime-type |
+application/octet-stream |
\ No newline at end of property |
/contrib/sdk/sources/expat/doc/reference.html |
---|
0,0 → 1,2390 |
<?xml version="1.0" encoding="iso-8859-1"?> |
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" |
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> |
<html> |
<head> |
<!-- Copyright 1999,2000 Clark Cooper <coopercc@netheaven.com> |
All rights reserved. |
This is free software. You may distribute or modify according to |
the terms of the MIT/X License --> |
<title>Expat XML Parser</title> |
<meta name="author" content="Clark Cooper, coopercc@netheaven.com" /> |
<meta http-equiv="Content-Style-Type" content="text/css" /> |
<link href="style.css" rel="stylesheet" type="text/css" /> |
</head> |
<body> |
<table cellspacing="0" cellpadding="0" width="100%"> |
<tr> |
<td class="corner"><img src="expat.png" alt="(Expat logo)" /></td> |
<td class="banner"><h1>The Expat XML Parser</h1></td> |
</tr> |
<tr> |
<td class="releaseno">Release 2.0.1</td> |
<td></td> |
</tr> |
</table> |
<div class="content"> |
<p>Expat is a library, written in C, for parsing XML documents. It's |
the underlying XML parser for the open source Mozilla project, Perl's |
<code>XML::Parser</code>, Python's <code>xml.parsers.expat</code>, and |
other open-source XML parsers.</p> |
<p>This library is the creation of James Clark, who's also given us |
groff (an nroff look-alike), Jade (an implemention of ISO's DSSSL |
stylesheet language for SGML), XP (a Java XML parser package), XT (a |
Java XSL engine). James was also the technical lead on the XML |
Working Group at W3C that produced the XML specification.</p> |
<p>This is free software, licensed under the <a |
href="../COPYING">MIT/X Consortium license</a>. You may download it |
from <a href="http://www.libexpat.org/">the Expat home page</a>. |
</p> |
<p>The bulk of this document was originally commissioned as an article |
by <a href="http://www.xml.com/">XML.com</a>. They graciously allowed |
Clark Cooper to retain copyright and to distribute it with Expat. |
This version has been substantially extended to include documentation |
on features which have been added since the original article was |
published, and additional information on using the original |
interface.</p> |
<hr /> |
<h2>Table of Contents</h2> |
<ul> |
<li><a href="#overview">Overview</a></li> |
<li><a href="#building">Building and Installing</a></li> |
<li><a href="#using">Using Expat</a></li> |
<li><a href="#reference">Reference</a> |
<ul> |
<li><a href="#creation">Parser Creation Functions</a> |
<ul> |
<li><a href="#XML_ParserCreate">XML_ParserCreate</a></li> |
<li><a href="#XML_ParserCreateNS">XML_ParserCreateNS</a></li> |
<li><a href="#XML_ParserCreate_MM">XML_ParserCreate_MM</a></li> |
<li><a href="#XML_ExternalEntityParserCreate">XML_ExternalEntityParserCreate</a></li> |
<li><a href="#XML_ParserFree">XML_ParserFree</a></li> |
<li><a href="#XML_ParserReset">XML_ParserReset</a></li> |
</ul> |
</li> |
<li><a href="#parsing">Parsing Functions</a> |
<ul> |
<li><a href="#XML_Parse">XML_Parse</a></li> |
<li><a href="#XML_ParseBuffer">XML_ParseBuffer</a></li> |
<li><a href="#XML_GetBuffer">XML_GetBuffer</a></li> |
<li><a href="#XML_StopParser">XML_StopParser</a></li> |
<li><a href="#XML_ResumeParser">XML_ResumeParser</a></li> |
<li><a href="#XML_GetParsingStatus">XML_GetParsingStatus</a></li> |
</ul> |
</li> |
<li><a href="#setting">Handler Setting Functions</a> |
<ul> |
<li><a href="#XML_SetStartElementHandler">XML_SetStartElementHandler</a></li> |
<li><a href="#XML_SetEndElementHandler">XML_SetEndElementHandler</a></li> |
<li><a href="#XML_SetElementHandler">XML_SetElementHandler</a></li> |
<li><a href="#XML_SetCharacterDataHandler">XML_SetCharacterDataHandler</a></li> |
<li><a href="#XML_SetProcessingInstructionHandler">XML_SetProcessingInstructionHandler</a></li> |
<li><a href="#XML_SetCommentHandler">XML_SetCommentHandler</a></li> |
<li><a href="#XML_SetStartCdataSectionHandler">XML_SetStartCdataSectionHandler</a></li> |
<li><a href="#XML_SetEndCdataSectionHandler">XML_SetEndCdataSectionHandler</a></li> |
<li><a href="#XML_SetCdataSectionHandler">XML_SetCdataSectionHandler</a></li> |
<li><a href="#XML_SetDefaultHandler">XML_SetDefaultHandler</a></li> |
<li><a href="#XML_SetDefaultHandlerExpand">XML_SetDefaultHandlerExpand</a></li> |
<li><a href="#XML_SetExternalEntityRefHandler">XML_SetExternalEntityRefHandler</a></li> |
<li><a href="#XML_SetExternalEntityRefHandlerArg">XML_SetExternalEntityRefHandlerArg</a></li> |
<li><a href="#XML_SetSkippedEntityHandler">XML_SetSkippedEntityHandler</a></li> |
<li><a href="#XML_SetUnknownEncodingHandler">XML_SetUnknownEncodingHandler</a></li> |
<li><a href="#XML_SetStartNamespaceDeclHandler">XML_SetStartNamespaceDeclHandler</a></li> |
<li><a href="#XML_SetEndNamespaceDeclHandler">XML_SetEndNamespaceDeclHandler</a></li> |
<li><a href="#XML_SetNamespaceDeclHandler">XML_SetNamespaceDeclHandler</a></li> |
<li><a href="#XML_SetXmlDeclHandler">XML_SetXmlDeclHandler</a></li> |
<li><a href="#XML_SetStartDoctypeDeclHandler">XML_SetStartDoctypeDeclHandler</a></li> |
<li><a href="#XML_SetEndDoctypeDeclHandler">XML_SetEndDoctypeDeclHandler</a></li> |
<li><a href="#XML_SetDoctypeDeclHandler">XML_SetDoctypeDeclHandler</a></li> |
<li><a href="#XML_SetElementDeclHandler">XML_SetElementDeclHandler</a></li> |
<li><a href="#XML_SetAttlistDeclHandler">XML_SetAttlistDeclHandler</a></li> |
<li><a href="#XML_SetEntityDeclHandler">XML_SetEntityDeclHandler</a></li> |
<li><a href="#XML_SetUnparsedEntityDeclHandler">XML_SetUnparsedEntityDeclHandler</a></li> |
<li><a href="#XML_SetNotationDeclHandler">XML_SetNotationDeclHandler</a></li> |
<li><a href="#XML_SetNotStandaloneHandler">XML_SetNotStandaloneHandler</a></li> |
</ul> |
</li> |
<li><a href="#position">Parse Position and Error Reporting Functions</a> |
<ul> |
<li><a href="#XML_GetErrorCode">XML_GetErrorCode</a></li> |
<li><a href="#XML_ErrorString">XML_ErrorString</a></li> |
<li><a href="#XML_GetCurrentByteIndex">XML_GetCurrentByteIndex</a></li> |
<li><a href="#XML_GetCurrentLineNumber">XML_GetCurrentLineNumber</a></li> |
<li><a href="#XML_GetCurrentColumnNumber">XML_GetCurrentColumnNumber</a></li> |
<li><a href="#XML_GetCurrentByteCount">XML_GetCurrentByteCount</a></li> |
<li><a href="#XML_GetInputContext">XML_GetInputContext</a></li> |
</ul> |
</li> |
<li><a href="#miscellaneous">Miscellaneous Functions</a> |
<ul> |
<li><a href="#XML_SetUserData">XML_SetUserData</a></li> |
<li><a href="#XML_GetUserData">XML_GetUserData</a></li> |
<li><a href="#XML_UseParserAsHandlerArg">XML_UseParserAsHandlerArg</a></li> |
<li><a href="#XML_SetBase">XML_SetBase</a></li> |
<li><a href="#XML_GetBase">XML_GetBase</a></li> |
<li><a href="#XML_GetSpecifiedAttributeCount">XML_GetSpecifiedAttributeCount</a></li> |
<li><a href="#XML_GetIdAttributeIndex">XML_GetIdAttributeIndex</a></li> |
<li><a href="#XML_GetAttributeInfo">XML_GetAttributeInfo</a></li> |
<li><a href="#XML_SetEncoding">XML_SetEncoding</a></li> |
<li><a href="#XML_SetParamEntityParsing">XML_SetParamEntityParsing</a></li> |
<li><a href="#XML_SetHashSalt">XML_SetHashSalt</a></li> |
<li><a href="#XML_UseForeignDTD">XML_UseForeignDTD</a></li> |
<li><a href="#XML_SetReturnNSTriplet">XML_SetReturnNSTriplet</a></li> |
<li><a href="#XML_DefaultCurrent">XML_DefaultCurrent</a></li> |
<li><a href="#XML_ExpatVersion">XML_ExpatVersion</a></li> |
<li><a href="#XML_ExpatVersionInfo">XML_ExpatVersionInfo</a></li> |
<li><a href="#XML_GetFeatureList">XML_GetFeatureList</a></li> |
<li><a href="#XML_FreeContentModel">XML_FreeContentModel</a></li> |
<li><a href="#XML_MemMalloc">XML_MemMalloc</a></li> |
<li><a href="#XML_MemRealloc">XML_MemRealloc</a></li> |
<li><a href="#XML_MemFree">XML_MemFree</a></li> |
</ul> |
</li> |
</ul> |
</li> |
</ul> |
<hr /> |
<h2><a name="overview">Overview</a></h2> |
<p>Expat is a stream-oriented parser. You register callback (or |
handler) functions with the parser and then start feeding it the |
document. As the parser recognizes parts of the document, it will |
call the appropriate handler for that part (if you've registered one.) |
The document is fed to the parser in pieces, so you can start parsing |
before you have all the document. This also allows you to parse really |
huge documents that won't fit into memory.</p> |
<p>Expat can be intimidating due to the many kinds of handlers and |
options you can set. But you only need to learn four functions in |
order to do 90% of what you'll want to do with it:</p> |
<dl> |
<dt><code><a href= "#XML_ParserCreate" |
>XML_ParserCreate</a></code></dt> |
<dd>Create a new parser object.</dd> |
<dt><code><a href= "#XML_SetElementHandler" |
>XML_SetElementHandler</a></code></dt> |
<dd>Set handlers for start and end tags.</dd> |
<dt><code><a href= "#XML_SetCharacterDataHandler" |
>XML_SetCharacterDataHandler</a></code></dt> |
<dd>Set handler for text.</dd> |
<dt><code><a href= "#XML_Parse" |
>XML_Parse</a></code></dt> |
<dd>Pass a buffer full of document to the parser</dd> |
</dl> |
<p>These functions and others are described in the <a |
href="#reference">reference</a> part of this document. The reference |
section also describes in detail the parameters passed to the |
different types of handlers.</p> |
<p>Let's look at a very simple example program that only uses 3 of the |
above functions (it doesn't need to set a character handler.) The |
program <a href="../examples/outline.c">outline.c</a> prints an |
element outline, indenting child elements to distinguish them from the |
parent element that contains them. The start handler does all the |
work. It prints two indenting spaces for every level of ancestor |
elements, then it prints the element and attribute |
information. Finally it increments the global <code>Depth</code> |
variable.</p> |
<pre class="eg"> |
int Depth; |
void XMLCALL |
start(void *data, const char *el, const char **attr) { |
int i; |
for (i = 0; i < Depth; i++) |
printf(" "); |
printf("%s", el); |
for (i = 0; attr[i]; i += 2) { |
printf(" %s='%s'", attr[i], attr[i + 1]); |
} |
printf("\n"); |
Depth++; |
} /* End of start handler */ |
</pre> |
<p>The end tag simply does the bookkeeping work of decrementing |
<code>Depth</code>.</p> |
<pre class="eg"> |
void XMLCALL |
end(void *data, const char *el) { |
Depth--; |
} /* End of end handler */ |
</pre> |
<p>Note the <code>XMLCALL</code> annotation used for the callbacks. |
This is used to ensure that the Expat and the callbacks are using the |
same calling convention in case the compiler options used for Expat |
itself and the client code are different. Expat tries not to care |
what the default calling convention is, though it may require that it |
be compiled with a default convention of "cdecl" on some platforms. |
For code which uses Expat, however, the calling convention is |
specified by the <code>XMLCALL</code> annotation on most platforms; |
callbacks should be defined using this annotation.</p> |
<p>The <code>XMLCALL</code> annotation was added in Expat 1.95.7, but |
existing working Expat applications don't need to add it (since they |
are already using the "cdecl" calling convention, or they wouldn't be |
working). The annotation is only needed if the default calling |
convention may be something other than "cdecl". To use the annotation |
safely with older versions of Expat, you can conditionally define it |
<em>after</em> including Expat's header file:</p> |
<pre class="eg"> |
#include <expat.h> |
#ifndef XMLCALL |
#if defined(_MSC_EXTENSIONS) && !defined(__BEOS__) && !defined(__CYGWIN__) |
#define XMLCALL __cdecl |
#elif defined(__GNUC__) |
#define XMLCALL __attribute__((cdecl)) |
#else |
#define XMLCALL |
#endif |
#endif |
</pre> |
<p>After creating the parser, the main program just has the job of |
shoveling the document to the parser so that it can do its work.</p> |
<hr /> |
<h2><a name="building">Building and Installing Expat</a></h2> |
<p>The Expat distribution comes as a compressed (with GNU gzip) tar |
file. You may download the latest version from <a href= |
"http://sourceforge.net/projects/expat/" >Source Forge</a>. After |
unpacking this, cd into the directory. Then follow either the Win32 |
directions or Unix directions below.</p> |
<h3>Building under Win32</h3> |
<p>If you're using the GNU compiler under cygwin, follow the Unix |
directions in the next section. Otherwise if you have Microsoft's |
Developer Studio installed, then from Windows Explorer double-click on |
"expat.dsp" in the lib directory and build and install in the usual |
manner.</p> |
<p>Alternatively, you may download the Win32 binary package that |
contains the "expat.h" include file and a pre-built DLL.</p> |
<h3>Building under Unix (or GNU)</h3> |
<p>First you'll need to run the configure shell script in order to |
configure the Makefiles and headers for your system.</p> |
<p>If you're happy with all the defaults that configure picks for you, |
and you have permission on your system to install into /usr/local, you |
can install Expat with this sequence of commands:</p> |
<pre class="eg"> |
./configure |
make |
make install |
</pre> |
<p>There are some options that you can provide to this script, but the |
only one we'll mention here is the <code>--prefix</code> option. You |
can find out all the options available by running configure with just |
the <code>--help</code> option.</p> |
<p>By default, the configure script sets things up so that the library |
gets installed in <code>/usr/local/lib</code> and the associated |
header file in <code>/usr/local/include</code>. But if you were to |
give the option, <code>--prefix=/home/me/mystuff</code>, then the |
library and header would get installed in |
<code>/home/me/mystuff/lib</code> and |
<code>/home/me/mystuff/include</code> respectively.</p> |
<h3>Configuring Expat Using the Pre-Processor</h3> |
<p>Expat's feature set can be configured using a small number of |
pre-processor definitions. The definition of this symbols does not |
affect the set of entry points for Expat, only the behavior of the API |
and the definition of character types in the case of |
<code>XML_UNICODE_WCHAR_T</code>. The symbols are:</p> |
<dl class="cpp-symbols"> |
<dt>XML_DTD</dt> |
<dd>Include support for using and reporting DTD-based content. If |
this is defined, default attribute values from an external DTD subset |
are reported and attribute value normalization occurs based on the |
type of attributes defined in the external subset. Without |
this, Expat has a smaller memory footprint and can be faster, but will |
not load external entities or process conditional sections. This does |
not affect the set of functions available in the API.</dd> |
<dt>XML_NS</dt> |
<dd>When defined, support for the <cite><a href= |
"http://www.w3.org/TR/REC-xml-names/" >Namespaces in XML</a></cite> |
specification is included.</dd> |
<dt>XML_UNICODE</dt> |
<dd>When defined, character data reported to the application is |
encoded in UTF-16 using wide characters of the type |
<code>XML_Char</code>. This is implied if |
<code>XML_UNICODE_WCHAR_T</code> is defined.</dd> |
<dt>XML_UNICODE_WCHAR_T</dt> |
<dd>If defined, causes the <code>XML_Char</code> character type to be |
defined using the <code>wchar_t</code> type; otherwise, <code>unsigned |
short</code> is used. Defining this implies |
<code>XML_UNICODE</code>.</dd> |
<dt>XML_LARGE_SIZE</dt> |
<dd>If defined, causes the <code>XML_Size</code> and <code>XML_Index</code> |
integer types to be at least 64 bits in size. This is intended to support |
processing of very large input streams, where the return values of |
<code><a href="#XML_GetCurrentByteIndex" >XML_GetCurrentByteIndex</a></code>, |
<code><a href="#XML_GetCurrentLineNumber" >XML_GetCurrentLineNumber</a></code> and |
<code><a href="#XML_GetCurrentColumnNumber" >XML_GetCurrentColumnNumber</a></code> |
could overflow. It may not be supported by all compilers, and is turned |
off by default.</dd> |
<dt>XML_CONTEXT_BYTES</dt> |
<dd>The number of input bytes of markup context which the parser will |
ensure are available for reporting via <code><a href= |
"#XML_GetInputContext" >XML_GetInputContext</a></code>. This is |
normally set to 1024, and must be set to a positive interger. If this |
is not defined, the input context will not be available and <code><a |
href= "#XML_GetInputContext" >XML_GetInputContext</a></code> will |
always report NULL. Without this, Expat has a smaller memory |
footprint and can be faster.</dd> |
<dt>XML_STATIC</dt> |
<dd>On Windows, this should be set if Expat is going to be linked |
statically with the code that calls it; this is required to get all |
the right MSVC magic annotations correct. This is ignored on other |
platforms.</dd> |
<dt>XML_ATTR_INFO</dt> |
<dd>If defined, makes the the additional function <code><a href= |
"#XML_GetAttributeInfo" >XML_GetAttributeInfo</a></code> available |
for reporting attribute byte offsets.</dd> |
</dl> |
<hr /> |
<h2><a name="using">Using Expat</a></h2> |
<h3>Compiling and Linking Against Expat</h3> |
<p>Unless you installed Expat in a location not expected by your |
compiler and linker, all you have to do to use Expat in your programs |
is to include the Expat header (<code>#include <expat.h></code>) |
in your files that make calls to it and to tell the linker that it |
needs to link against the Expat library. On Unix systems, this would |
usually be done with the <code>-lexpat</code> argument. Otherwise, |
you'll need to tell the compiler where to look for the Expat header |
and the linker where to find the Expat library. You may also need to |
take steps to tell the operating system where to find this library at |
run time.</p> |
<p>On a Unix-based system, here's what a Makefile might look like when |
Expat is installed in a standard location:</p> |
<pre class="eg"> |
CC=cc |
LDFLAGS= |
LIBS= -lexpat |
xmlapp: xmlapp.o |
$(CC) $(LDFLAGS) -o xmlapp xmlapp.o $(LIBS) |
</pre> |
<p>If you installed Expat in, say, <code>/home/me/mystuff</code>, then |
the Makefile would look like this:</p> |
<pre class="eg"> |
CC=cc |
CFLAGS= -I/home/me/mystuff/include |
LDFLAGS= |
LIBS= -L/home/me/mystuff/lib -lexpat |
xmlapp: xmlapp.o |
$(CC) $(LDFLAGS) -o xmlapp xmlapp.o $(LIBS) |
</pre> |
<p>You'd also have to set the environment variable |
<code>LD_LIBRARY_PATH</code> to <code>/home/me/mystuff/lib</code> (or |
to <code>${LD_LIBRARY_PATH}:/home/me/mystuff/lib</code> if |
LD_LIBRARY_PATH already has some directories in it) in order to run |
your application.</p> |
<h3>Expat Basics</h3> |
<p>As we saw in the example in the overview, the first step in parsing |
an XML document with Expat is to create a parser object. There are <a |
href="#creation">three functions</a> in the Expat API for creating a |
parser object. However, only two of these (<code><a href= |
"#XML_ParserCreate" >XML_ParserCreate</a></code> and <code><a href= |
"#XML_ParserCreateNS" >XML_ParserCreateNS</a></code>) can be used for |
constructing a parser for a top-level document. The object returned |
by these functions is an opaque pointer (i.e. "expat.h" declares it as |
void *) to data with further internal structure. In order to free the |
memory associated with this object you must call <code><a href= |
"#XML_ParserFree" >XML_ParserFree</a></code>. Note that if you have |
provided any <a href="#userdata">user data</a> that gets stored in the |
parser, then your application is responsible for freeing it prior to |
calling <code>XML_ParserFree</code>.</p> |
<p>The objects returned by the parser creation functions are good for |
parsing only one XML document or external parsed entity. If your |
application needs to parse many XML documents, then it needs to create |
a parser object for each one. The best way to deal with this is to |
create a higher level object that contains all the default |
initialization you want for your parser objects.</p> |
<p>Walking through a document hierarchy with a stream oriented parser |
will require a good stack mechanism in order to keep track of current |
context. For instance, to answer the simple question, "What element |
does this text belong to?" requires a stack, since the parser may have |
descended into other elements that are children of the current one and |
has encountered this text on the way out.</p> |
<p>The things you're likely to want to keep on a stack are the |
currently opened element and it's attributes. You push this |
information onto the stack in the start handler and you pop it off in |
the end handler.</p> |
<p>For some tasks, it is sufficient to just keep information on what |
the depth of the stack is (or would be if you had one.) The outline |
program shown above presents one example. Another such task would be |
skipping over a complete element. When you see the start tag for the |
element you want to skip, you set a skip flag and record the depth at |
which the element started. When the end tag handler encounters the |
same depth, the skipped element has ended and the flag may be |
cleared. If you follow the convention that the root element starts at |
1, then you can use the same variable for skip flag and skip |
depth.</p> |
<pre class="eg"> |
void |
init_info(Parseinfo *info) { |
info->skip = 0; |
info->depth = 1; |
/* Other initializations here */ |
} /* End of init_info */ |
void XMLCALL |
rawstart(void *data, const char *el, const char **attr) { |
Parseinfo *inf = (Parseinfo *) data; |
if (! inf->skip) { |
if (should_skip(inf, el, attr)) { |
inf->skip = inf->depth; |
} |
else |
start(inf, el, attr); /* This does rest of start handling */ |
} |
inf->depth++; |
} /* End of rawstart */ |
void XMLCALL |
rawend(void *data, const char *el) { |
Parseinfo *inf = (Parseinfo *) data; |
inf->depth--; |
if (! inf->skip) |
end(inf, el); /* This does rest of end handling */ |
if (inf->skip == inf->depth) |
inf->skip = 0; |
} /* End rawend */ |
</pre> |
<p>Notice in the above example the difference in how depth is |
manipulated in the start and end handlers. The end tag handler should |
be the mirror image of the start tag handler. This is necessary to |
properly model containment. Since, in the start tag handler, we |
incremented depth <em>after</em> the main body of start tag code, then |
in the end handler, we need to manipulate it <em>before</em> the main |
body. If we'd decided to increment it first thing in the start |
handler, then we'd have had to decrement it last thing in the end |
handler.</p> |
<h3 id="userdata">Communicating between handlers</h3> |
<p>In order to be able to pass information between different handlers |
without using globals, you'll need to define a data structure to hold |
the shared variables. You can then tell Expat (with the <code><a href= |
"#XML_SetUserData" >XML_SetUserData</a></code> function) to pass a |
pointer to this structure to the handlers. This is the first |
argument received by most handlers. In the <a href="#reference" |
>reference section</a>, an argument to a callback function is named |
<code>userData</code> and have type <code>void *</code> if the user |
data is passed; it will have the type <code>XML_Parser</code> if the |
parser itself is passed. When the parser is passed, the user data may |
be retrieved using <code><a href="#XML_GetUserData" |
>XML_GetUserData</a></code>.</p> |
<p>One common case where multiple calls to a single handler may need |
to communicate using an application data structure is the case when |
content passed to the character data handler (set by <code><a href= |
"#XML_SetCharacterDataHandler" |
>XML_SetCharacterDataHandler</a></code>) needs to be accumulated. A |
common first-time mistake with any of the event-oriented interfaces to |
an XML parser is to expect all the text contained in an element to be |
reported by a single call to the character data handler. Expat, like |
many other XML parsers, reports such data as a sequence of calls; |
there's no way to know when the end of the sequence is reached until a |
different callback is made. A buffer referenced by the user data |
structure proves both an effective and convenient place to accumulate |
character data.</p> |
<!-- XXX example needed here --> |
<h3>XML Version</h3> |
<p>Expat is an XML 1.0 parser, and as such never complains based on |
the value of the <code>version</code> pseudo-attribute in the XML |
declaration, if present.</p> |
<p>If an application needs to check the version number (to support |
alternate processing), it should use the <code><a href= |
"#XML_SetXmlDeclHandler" >XML_SetXmlDeclHandler</a></code> function to |
set a handler that uses the information in the XML declaration to |
determine what to do. This example shows how to check that only a |
version number of <code>"1.0"</code> is accepted:</p> |
<pre class="eg"> |
static int wrong_version; |
static XML_Parser parser; |
static void XMLCALL |
xmldecl_handler(void *userData, |
const XML_Char *version, |
const XML_Char *encoding, |
int standalone) |
{ |
static const XML_Char Version_1_0[] = {'1', '.', '0', 0}; |
int i; |
for (i = 0; i < (sizeof(Version_1_0) / sizeof(Version_1_0[0])); ++i) { |
if (version[i] != Version_1_0[i]) { |
wrong_version = 1; |
/* also clear all other handlers: */ |
XML_SetCharacterDataHandler(parser, NULL); |
... |
return; |
} |
} |
... |
} |
</pre> |
<h3>Namespace Processing</h3> |
<p>When the parser is created using the <code><a href= |
"#XML_ParserCreateNS" >XML_ParserCreateNS</a></code>, function, Expat |
performs namespace processing. Under namespace processing, Expat |
consumes <code>xmlns</code> and <code>xmlns:...</code> attributes, |
which declare namespaces for the scope of the element in which they |
occur. This means that your start handler will not see these |
attributes. Your application can still be informed of these |
declarations by setting namespace declaration handlers with <a href= |
"#XML_SetNamespaceDeclHandler" |
><code>XML_SetNamespaceDeclHandler</code></a>.</p> |
<p>Element type and attribute names that belong to a given namespace |
are passed to the appropriate handler in expanded form. By default |
this expanded form is a concatenation of the namespace URI, the |
separator character (which is the 2nd argument to <code><a href= |
"#XML_ParserCreateNS" >XML_ParserCreateNS</a></code>), and the local |
name (i.e. the part after the colon). Names with undeclared prefixes |
are not well-formed when namespace processing is enabled, and will |
trigger an error. Unprefixed attribute names are never expanded, |
and unprefixed element names are only expanded when they are in the |
scope of a default namespace.</p> |
<p>However if <code><a href= "#XML_SetReturnNSTriplet" |
>XML_SetReturnNSTriplet</a></code> has been called with a non-zero |
<code>do_nst</code> parameter, then the expanded form for names with |
an explicit prefix is a concatenation of: URI, separator, local name, |
separator, prefix.</p> |
<p>You can set handlers for the start of a namespace declaration and |
for the end of a scope of a declaration with the <code><a href= |
"#XML_SetNamespaceDeclHandler" >XML_SetNamespaceDeclHandler</a></code> |
function. The StartNamespaceDeclHandler is called prior to the start |
tag handler and the EndNamespaceDeclHandler is called after the |
corresponding end tag that ends the namespace's scope. The namespace |
start handler gets passed the prefix and URI for the namespace. For a |
default namespace declaration (xmlns='...'), the prefix will be null. |
The URI will be null for the case where the default namespace is being |
unset. The namespace end handler just gets the prefix for the closing |
scope.</p> |
<p>These handlers are called for each declaration. So if, for |
instance, a start tag had three namespace declarations, then the |
StartNamespaceDeclHandler would be called three times before the start |
tag handler is called, once for each declaration.</p> |
<h3>Character Encodings</h3> |
<p>While XML is based on Unicode, and every XML processor is required |
to recognized UTF-8 and UTF-16 (1 and 2 byte encodings of Unicode), |
other encodings may be declared in XML documents or entities. For the |
main document, an XML declaration may contain an encoding |
declaration:</p> |
<pre> |
<?xml version="1.0" encoding="ISO-8859-2"?> |
</pre> |
<p>External parsed entities may begin with a text declaration, which |
looks like an XML declaration with just an encoding declaration:</p> |
<pre> |
<?xml encoding="Big5"?> |
</pre> |
<p>With Expat, you may also specify an encoding at the time of |
creating a parser. This is useful when the encoding information may |
come from a source outside the document itself (like a higher level |
protocol.)</p> |
<p><a name="builtin_encodings"></a>There are four built-in encodings |
in Expat:</p> |
<ul> |
<li>UTF-8</li> |
<li>UTF-16</li> |
<li>ISO-8859-1</li> |
<li>US-ASCII</li> |
</ul> |
<p>Anything else discovered in an encoding declaration or in the |
protocol encoding specified in the parser constructor, triggers a call |
to the <code>UnknownEncodingHandler</code>. This handler gets passed |
the encoding name and a pointer to an <code>XML_Encoding</code> data |
structure. Your handler must fill in this structure and return |
<code>XML_STATUS_OK</code> if it knows how to deal with the |
encoding. Otherwise the handler should return |
<code>XML_STATUS_ERROR</code>. The handler also gets passed a pointer |
to an optional application data structure that you may indicate when |
you set the handler.</p> |
<p>Expat places restrictions on character encodings that it can |
support by filling in the <code>XML_Encoding</code> structure. |
include file:</p> |
<ol> |
<li>Every ASCII character that can appear in a well-formed XML document |
must be represented by a single byte, and that byte must correspond to |
it's ASCII encoding (except for the characters $@\^'{}~)</li> |
<li>Characters must be encoded in 4 bytes or less.</li> |
<li>All characters encoded must have Unicode scalar values less than or |
equal to 65535 (0xFFFF)<em>This does not apply to the built-in support |
for UTF-16 and UTF-8</em></li> |
<li>No character may be encoded by more that one distinct sequence of |
bytes</li> |
</ol> |
<p><code>XML_Encoding</code> contains an array of integers that |
correspond to the 1st byte of an encoding sequence. If the value in |
the array for a byte is zero or positive, then the byte is a single |
byte encoding that encodes the Unicode scalar value contained in the |
array. A -1 in this array indicates a malformed byte. If the value is |
-2, -3, or -4, then the byte is the beginning of a 2, 3, or 4 byte |
sequence respectively. Multi-byte sequences are sent to the convert |
function pointed at in the <code>XML_Encoding</code> structure. This |
function should return the Unicode scalar value for the sequence or -1 |
if the sequence is malformed.</p> |
<p>One pitfall that novice Expat users are likely to fall into is that |
although Expat may accept input in various encodings, the strings that |
it passes to the handlers are always encoded in UTF-8 or UTF-16 |
(depending on how Expat was compiled). Your application is responsible |
for any translation of these strings into other encodings.</p> |
<h3>Handling External Entity References</h3> |
<p>Expat does not read or parse external entities directly. Note that |
any external DTD is a special case of an external entity. If you've |
set no <code>ExternalEntityRefHandler</code>, then external entity |
references are silently ignored. Otherwise, it calls your handler with |
the information needed to read and parse the external entity.</p> |
<p>Your handler isn't actually responsible for parsing the entity, but |
it is responsible for creating a subsidiary parser with <code><a href= |
"#XML_ExternalEntityParserCreate" |
>XML_ExternalEntityParserCreate</a></code> that will do the job. This |
returns an instance of <code>XML_Parser</code> that has handlers and |
other data structures initialized from the parent parser. You may then |
use <code><a href= "#XML_Parse" >XML_Parse</a></code> or <code><a |
href= "#XML_ParseBuffer">XML_ParseBuffer</a></code> calls against this |
parser. Since external entities my refer to other external entities, |
your handler should be prepared to be called recursively.</p> |
<h3>Parsing DTDs</h3> |
<p>In order to parse parameter entities, before starting the parse, |
you must call <code><a href= "#XML_SetParamEntityParsing" |
>XML_SetParamEntityParsing</a></code> with one of the following |
arguments:</p> |
<dl> |
<dt><code>XML_PARAM_ENTITY_PARSING_NEVER</code></dt> |
<dd>Don't parse parameter entities or the external subset</dd> |
<dt><code>XML_PARAM_ENTITY_PARSING_UNLESS_STANDALONE</code></dt> |
<dd>Parse parameter entites and the external subset unless |
<code>standalone</code> was set to "yes" in the XML declaration.</dd> |
<dt><code>XML_PARAM_ENTITY_PARSING_ALWAYS</code></dt> |
<dd>Always parse parameter entities and the external subset</dd> |
</dl> |
<p>In order to read an external DTD, you also have to set an external |
entity reference handler as described above.</p> |
<h3 id="stop-resume">Temporarily Stopping Parsing</h3> |
<p>Expat 1.95.8 introduces a new feature: its now possible to stop |
parsing temporarily from within a handler function, even if more data |
has already been passed into the parser. Applications for this |
include</p> |
<ul> |
<li>Supporting the <a href= "http://www.w3.org/TR/xinclude/" |
>XInclude</a> specification.</li> |
<li>Delaying further processing until additional information is |
available from some other source.</li> |
<li>Adjusting processor load as task priorities shift within an |
application.</li> |
<li>Stopping parsing completely (simply free or reset the parser |
instead of resuming in the outer parsing loop). This can be useful |
if a application-domain error is found in the XML being parsed or if |
the result of the parse is determined not to be useful after |
all.</li> |
</ul> |
<p>To take advantage of this feature, the main parsing loop of an |
application needs to support this specifically. It cannot be |
supported with a parsing loop compatible with Expat 1.95.7 or |
earlier (though existing loops will continue to work without |
supporting the stop/resume feature).</p> |
<p>An application that uses this feature for a single parser will have |
the rough structure (in pseudo-code):</p> |
<pre class="pseudocode"> |
fd = open_input() |
p = create_parser() |
if parse_xml(p, fd) { |
/* suspended */ |
int suspended = 1; |
while (suspended) { |
do_something_else() |
if ready_to_resume() { |
suspended = continue_parsing(p, fd); |
} |
} |
} |
</pre> |
<p>An application that may resume any of several parsers based on |
input (either from the XML being parsed or some other source) will |
certainly have more interesting control structures.</p> |
<p>This C function could be used for the <code>parse_xml</code> |
function mentioned in the pseudo-code above:</p> |
<pre class="eg"> |
#define BUFF_SIZE 10240 |
/* Parse a document from the open file descriptor 'fd' until the parse |
is complete (the document has been completely parsed, or there's |
been an error), or the parse is stopped. Return non-zero when |
the parse is merely suspended. |
*/ |
int |
parse_xml(XML_Parser p, int fd) |
{ |
for (;;) { |
int last_chunk; |
int bytes_read; |
enum XML_Status status; |
void *buff = XML_GetBuffer(p, BUFF_SIZE); |
if (buff == NULL) { |
/* handle error... */ |
return 0; |
} |
bytes_read = read(fd, buff, BUFF_SIZE); |
if (bytes_read < 0) { |
/* handle error... */ |
return 0; |
} |
status = XML_ParseBuffer(p, bytes_read, bytes_read == 0); |
switch (status) { |
case XML_STATUS_ERROR: |
/* handle error... */ |
return 0; |
case XML_STATUS_SUSPENDED: |
return 1; |
} |
if (bytes_read == 0) |
return 0; |
} |
} |
</pre> |
<p>The corresponding <code>continue_parsing</code> function is |
somewhat simpler, since it only need deal with the return code from |
<code><a href= "#XML_ResumeParser">XML_ResumeParser</a></code>; it can |
delegate the input handling to the <code>parse_xml</code> |
function:</p> |
<pre class="eg"> |
/* Continue parsing a document which had been suspended. The 'p' and |
'fd' arguments are the same as passed to parse_xml(). Return |
non-zero when the parse is suspended. |
*/ |
int |
continue_parsing(XML_Parser p, int fd) |
{ |
enum XML_Status status = XML_ResumeParser(p); |
switch (status) { |
case XML_STATUS_ERROR: |
/* handle error... */ |
return 0; |
case XML_ERROR_NOT_SUSPENDED: |
/* handle error... */ |
return 0;. |
case XML_STATUS_SUSPENDED: |
return 1; |
} |
return parse_xml(p, fd); |
} |
</pre> |
<p>Now that we've seen what a mess the top-level parsing loop can |
become, what have we gained? Very simply, we can now use the <code><a |
href= "#XML_StopParser" >XML_StopParser</a></code> function to stop |
parsing, without having to go to great lengths to avoid additional |
processing that we're expecting to ignore. As a bonus, we get to stop |
parsing <em>temporarily</em>, and come back to it when we're |
ready.</p> |
<p>To stop parsing from a handler function, use the <code><a href= |
"#XML_StopParser" >XML_StopParser</a></code> function. This function |
takes two arguments; the parser being stopped and a flag indicating |
whether the parse can be resumed in the future.</p> |
<!-- XXX really need more here --> |
<hr /> |
<!-- ================================================================ --> |
<h2><a name="reference">Expat Reference</a></h2> |
<h3><a name="creation">Parser Creation</a></h3> |
<pre class="fcndec" id="XML_ParserCreate"> |
XML_Parser XMLCALL |
XML_ParserCreate(const XML_Char *encoding); |
</pre> |
<div class="fcndef"> |
Construct a new parser. If encoding is non-null, it specifies a |
character encoding to use for the document. This overrides the document |
encoding declaration. There are four built-in encodings: |
<ul> |
<li>US-ASCII</li> |
<li>UTF-8</li> |
<li>UTF-16</li> |
<li>ISO-8859-1</li> |
</ul> |
Any other value will invoke a call to the UnknownEncodingHandler. |
</div> |
<pre class="fcndec" id="XML_ParserCreateNS"> |
XML_Parser XMLCALL |
XML_ParserCreateNS(const XML_Char *encoding, |
XML_Char sep); |
</pre> |
<div class="fcndef"> |
Constructs a new parser that has namespace processing in effect. Namespace |
expanded element names and attribute names are returned as a concatenation |
of the namespace URI, <em>sep</em>, and the local part of the name. This |
means that you should pick a character for <em>sep</em> that can't be part |
of an URI. Since Expat does not check namespace URIs for conformance, the |
only safe choice for a namespace separator is a character that is illegal |
in XML. For instance, <code>'\xFF'</code> is not legal in UTF-8, and |
<code>'\xFFFF'</code> is not legal in UTF-16. There is a special case when |
<em>sep</em> is the null character <code>'\0'</code>: the namespace URI and |
the local part will be concatenated without any separator - this is intended |
to support RDF processors. It is a programming error to use the null separator |
with <a href= "#XML_SetReturnNSTriplet">namespace triplets</a>.</div> |
<pre class="fcndec" id="XML_ParserCreate_MM"> |
XML_Parser XMLCALL |
XML_ParserCreate_MM(const XML_Char *encoding, |
const XML_Memory_Handling_Suite *ms, |
const XML_Char *sep); |
</pre> |
<pre class="signature"> |
typedef struct { |
void *(XMLCALL *malloc_fcn)(size_t size); |
void *(XMLCALL *realloc_fcn)(void *ptr, size_t size); |
void (XMLCALL *free_fcn)(void *ptr); |
} XML_Memory_Handling_Suite; |
</pre> |
<div class="fcndef"> |
<p>Construct a new parser using the suite of memory handling functions |
specified in <code>ms</code>. If <code>ms</code> is NULL, then use the |
standard set of memory management functions. If <code>sep</code> is |
non NULL, then namespace processing is enabled in the created parser |
and the character pointed at by sep is used as the separator between |
the namespace URI and the local part of the name.</p> |
</div> |
<pre class="fcndec" id="XML_ExternalEntityParserCreate"> |
XML_Parser XMLCALL |
XML_ExternalEntityParserCreate(XML_Parser p, |
const XML_Char *context, |
const XML_Char *encoding); |
</pre> |
<div class="fcndef"> |
Construct a new <code>XML_Parser</code> object for parsing an external |
general entity. Context is the context argument passed in a call to a |
ExternalEntityRefHandler. Other state information such as handlers, |
user data, namespace processing is inherited from the parser passed as |
the 1st argument. So you shouldn't need to call any of the behavior |
changing functions on this parser (unless you want it to act |
differently than the parent parser). |
</div> |
<pre class="fcndec" id="XML_ParserFree"> |
void XMLCALL |
XML_ParserFree(XML_Parser p); |
</pre> |
<div class="fcndef"> |
Free memory used by the parser. Your application is responsible for |
freeing any memory associated with <a href="#userdata">user data</a>. |
</div> |
<pre class="fcndec" id="XML_ParserReset"> |
XML_Bool XMLCALL |
XML_ParserReset(XML_Parser p, |
const XML_Char *encoding); |
</pre> |
<div class="fcndef"> |
Clean up the memory structures maintained by the parser so that it may |
be used again. After this has been called, <code>parser</code> is |
ready to start parsing a new document. All handlers are cleared from |
the parser, except for the unknownEncodingHandler. The parser's external |
state is re-initialized except for the values of ns and ns_triplets. |
This function may not be used on a parser created using <code><a href= |
"#XML_ExternalEntityParserCreate" >XML_ExternalEntityParserCreate</a |
></code>; it will return <code>XML_FALSE</code> in that case. Returns |
<code>XML_TRUE</code> on success. Your application is responsible for |
dealing with any memory associated with <a href="#userdata">user data</a>. |
</div> |
<h3><a name="parsing">Parsing</a></h3> |
<p>To state the obvious: the three parsing functions <code><a href= |
"#XML_Parse" >XML_Parse</a></code>, <code><a href= "#XML_ParseBuffer"> |
XML_ParseBuffer</a></code> and <code><a href= "#XML_GetBuffer"> |
XML_GetBuffer</a></code> must not be called from within a handler |
unless they operate on a separate parser instance, that is, one that |
did not call the handler. For example, it is OK to call the parsing |
functions from within an <code>XML_ExternalEntityRefHandler</code>, |
if they apply to the parser created by |
<code><a href= "#XML_ExternalEntityParserCreate" |
>XML_ExternalEntityParserCreate</a></code>.</p> |
<p>Note: the <code>len</code> argument passed to these functions |
should be considerably less than the maximum value for an integer, |
as it could create an integer overflow situation if the added |
lengths of a buffer and the unprocessed portion of the previous buffer |
exceed the maximum integer value. Input data at the end of a buffer |
will remain unprocessed if it is part of an XML token for which the |
end is not part of that buffer.</p> |
<pre class="fcndec" id="XML_Parse"> |
enum XML_Status XMLCALL |
XML_Parse(XML_Parser p, |
const char *s, |
int len, |
int isFinal); |
</pre> |
<pre class="signature"> |
enum XML_Status { |
XML_STATUS_ERROR = 0, |
XML_STATUS_OK = 1 |
}; |
</pre> |
<div class="fcndef"> |
Parse some more of the document. The string <code>s</code> is a buffer |
containing part (or perhaps all) of the document. The number of bytes of s |
that are part of the document is indicated by <code>len</code>. This means |
that <code>s</code> doesn't have to be null terminated. It also means that |
if <code>len</code> is larger than the number of bytes in the block of |
memory that <code>s</code> points at, then a memory fault is likely. The |
<code>isFinal</code> parameter informs the parser that this is the last |
piece of the document. Frequently, the last piece is empty (i.e. |
<code>len</code> is zero.) |
If a parse error occurred, it returns <code>XML_STATUS_ERROR</code>. |
Otherwise it returns <code>XML_STATUS_OK</code> value. |
</div> |
<pre class="fcndec" id="XML_ParseBuffer"> |
enum XML_Status XMLCALL |
XML_ParseBuffer(XML_Parser p, |
int len, |
int isFinal); |
</pre> |
<div class="fcndef"> |
This is just like <code><a href= "#XML_Parse" >XML_Parse</a></code>, |
except in this case Expat provides the buffer. By obtaining the |
buffer from Expat with the <code><a href= "#XML_GetBuffer" |
>XML_GetBuffer</a></code> function, the application can avoid double |
copying of the input. |
</div> |
<pre class="fcndec" id="XML_GetBuffer"> |
void * XMLCALL |
XML_GetBuffer(XML_Parser p, |
int len); |
</pre> |
<div class="fcndef"> |
Obtain a buffer of size <code>len</code> to read a piece of the document |
into. A NULL value is returned if Expat can't allocate enough memory for |
this buffer. This has to be called prior to every call to |
<code><a href= "#XML_ParseBuffer" >XML_ParseBuffer</a></code>. A |
typical use would look like this: |
<pre class="eg"> |
for (;;) { |
int bytes_read; |
void *buff = XML_GetBuffer(p, BUFF_SIZE); |
if (buff == NULL) { |
/* handle error */ |
} |
bytes_read = read(docfd, buff, BUFF_SIZE); |
if (bytes_read < 0) { |
/* handle error */ |
} |
if (! XML_ParseBuffer(p, bytes_read, bytes_read == 0)) { |
/* handle parse error */ |
} |
if (bytes_read == 0) |
break; |
} |
</pre> |
</div> |
<pre class="fcndec" id="XML_StopParser"> |
enum XML_Status XMLCALL |
XML_StopParser(XML_Parser p, |
XML_Bool resumable); |
</pre> |
<div class="fcndef"> |
<p>Stops parsing, causing <code><a href= "#XML_Parse" |
>XML_Parse</a></code> or <code><a href= "#XML_ParseBuffer" |
>XML_ParseBuffer</a></code> to return. Must be called from within a |
call-back handler, except when aborting (when <code>resumable</code> |
is <code>XML_FALSE</code>) an already suspended parser. Some |
call-backs may still follow because they would otherwise get |
lost, including |
<ul> |
<li> the end element handler for empty elements when stopped in the |
start element handler,</li> |
<li> the end namespace declaration handler when stopped in the end |
element handler,</li> |
<li> the character data handler when stopped in the character data handler |
while making multiple call-backs on a contiguous chunk of characters,</li> |
</ul> |
and possibly others.</p> |
<p>This can be called from most handlers, including DTD related |
call-backs, except when parsing an external parameter entity and |
<code>resumable</code> is <code>XML_TRUE</code>. Returns |
<code>XML_STATUS_OK</code> when successful, |
<code>XML_STATUS_ERROR</code> otherwise. The possible error codes |
are:</p> |
<dl> |
<dt><code>XML_ERROR_SUSPENDED</code></dt> |
<dd>when suspending an already suspended parser.</dd> |
<dt><code>XML_ERROR_FINISHED</code></dt> |
<dd>when the parser has already finished.</dd> |
<dt><code>XML_ERROR_SUSPEND_PE</code></dt> |
<dd>when suspending while parsing an external PE.</dd> |
</dl> |
<p>Since the stop/resume feature requires application support in the |
outer parsing loop, it is an error to call this function for a parser |
not being handled appropriately; see <a href= "#stop-resume" |
>Temporarily Stopping Parsing</a> for more information.</p> |
<p>When <code>resumable</code> is <code>XML_TRUE</code> then parsing |
is <em>suspended</em>, that is, <code><a href= "#XML_Parse" |
>XML_Parse</a></code> and <code><a href= "#XML_ParseBuffer" |
>XML_ParseBuffer</a></code> return <code>XML_STATUS_SUSPENDED</code>. |
Otherwise, parsing is <em>aborted</em>, that is, <code><a href= |
"#XML_Parse" >XML_Parse</a></code> and <code><a href= |
"#XML_ParseBuffer" >XML_ParseBuffer</a></code> return |
<code>XML_STATUS_ERROR</code> with error code |
<code>XML_ERROR_ABORTED</code>.</p> |
<p><strong>Note:</strong> |
This will be applied to the current parser instance only, that is, if |
there is a parent parser then it will continue parsing when the |
external entity reference handler returns. It is up to the |
implementation of that handler to call <code><a href= |
"#XML_StopParser" >XML_StopParser</a></code> on the parent parser |
(recursively), if one wants to stop parsing altogether.</p> |
<p>When suspended, parsing can be resumed by calling <code><a href= |
"#XML_ResumeParser" >XML_ResumeParser</a></code>.</p> |
<p>New in Expat 1.95.8.</p> |
</div> |
<pre class="fcndec" id="XML_ResumeParser"> |
enum XML_Status XMLCALL |
XML_ResumeParser(XML_Parser p); |
</pre> |
<div class="fcndef"> |
<p>Resumes parsing after it has been suspended with <code><a href= |
"#XML_StopParser" >XML_StopParser</a></code>. Must not be called from |
within a handler call-back. Returns same status codes as <code><a |
href= "#XML_Parse">XML_Parse</a></code> or <code><a href= |
"#XML_ParseBuffer" >XML_ParseBuffer</a></code>. An additional error |
code, <code>XML_ERROR_NOT_SUSPENDED</code>, will be returned if the |
parser was not currently suspended.</p> |
<p><strong>Note:</strong> |
This must be called on the most deeply nested child parser instance |
first, and on its parent parser only after the child parser has |
finished, to be applied recursively until the document entity's parser |
is restarted. That is, the parent parser will not resume by itself |
and it is up to the application to call <code><a href= |
"#XML_ResumeParser" >XML_ResumeParser</a></code> on it at the |
appropriate moment.</p> |
<p>New in Expat 1.95.8.</p> |
</div> |
<pre class="fcndec" id="XML_GetParsingStatus"> |
void XMLCALL |
XML_GetParsingStatus(XML_Parser p, |
XML_ParsingStatus *status); |
</pre> |
<pre class="signature"> |
enum XML_Parsing { |
XML_INITIALIZED, |
XML_PARSING, |
XML_FINISHED, |
XML_SUSPENDED |
}; |
typedef struct { |
enum XML_Parsing parsing; |
XML_Bool finalBuffer; |
} XML_ParsingStatus; |
</pre> |
<div class="fcndef"> |
<p>Returns status of parser with respect to being initialized, |
parsing, finished, or suspended, and whether the final buffer is being |
processed. The <code>status</code> parameter <em>must not</em> be |
NULL.</p> |
<p>New in Expat 1.95.8.</p> |
</div> |
<h3><a name="setting">Handler Setting</a></h3> |
<p>Although handlers are typically set prior to parsing and left alone, an |
application may choose to set or change the handler for a parsing event |
while the parse is in progress. For instance, your application may choose |
to ignore all text not descended from a <code>para</code> element. One |
way it could do this is to set the character handler when a para start tag |
is seen, and unset it for the corresponding end tag.</p> |
<p>A handler may be <em>unset</em> by providing a NULL pointer to the |
appropriate handler setter. None of the handler setting functions have |
a return value.</p> |
<p>Your handlers will be receiving strings in arrays of type |
<code>XML_Char</code>. This type is conditionally defined in expat.h as |
either <code>char</code>, <code>wchar_t</code> or <code>unsigned short</code>. |
The former implies UTF-8 encoding, the latter two imply UTF-16 encoding. |
Note that you'll receive them in this form independent of the original |
encoding of the document.</p> |
<div class="handler"> |
<pre class="setter" id="XML_SetStartElementHandler"> |
void XMLCALL |
XML_SetStartElementHandler(XML_Parser p, |
XML_StartElementHandler start); |
</pre> |
<pre class="signature"> |
typedef void |
(XMLCALL *XML_StartElementHandler)(void *userData, |
const XML_Char *name, |
const XML_Char **atts); |
</pre> |
<p>Set handler for start (and empty) tags. Attributes are passed to the start |
handler as a pointer to a vector of char pointers. Each attribute seen in |
a start (or empty) tag occupies 2 consecutive places in this vector: the |
attribute name followed by the attribute value. These pairs are terminated |
by a null pointer.</p> |
<p>Note that an empty tag generates a call to both start and end handlers |
(in that order).</p> |
</div> |
<div class="handler"> |
<pre class="setter" id="XML_SetEndElementHandler"> |
void XMLCALL |
XML_SetEndElementHandler(XML_Parser p, |
XML_EndElementHandler); |
</pre> |
<pre class="signature"> |
typedef void |
(XMLCALL *XML_EndElementHandler)(void *userData, |
const XML_Char *name); |
</pre> |
<p>Set handler for end (and empty) tags. As noted above, an empty tag |
generates a call to both start and end handlers.</p> |
</div> |
<div class="handler"> |
<pre class="setter" id="XML_SetElementHandler"> |
void XMLCALL |
XML_SetElementHandler(XML_Parser p, |
XML_StartElementHandler start, |
XML_EndElementHandler end); |
</pre> |
<p>Set handlers for start and end tags with one call.</p> |
</div> |
<div class="handler"> |
<pre class="setter" id="XML_SetCharacterDataHandler"> |
void XMLCALL |
XML_SetCharacterDataHandler(XML_Parser p, |
XML_CharacterDataHandler charhndl) |
</pre> |
<pre class="signature"> |
typedef void |
(XMLCALL *XML_CharacterDataHandler)(void *userData, |
const XML_Char *s, |
int len); |
</pre> |
<p>Set a text handler. The string your handler receives |
is <em>NOT nul-terminated</em>. You have to use the length argument |
to deal with the end of the string. A single block of contiguous text |
free of markup may still result in a sequence of calls to this handler. |
In other words, if you're searching for a pattern in the text, it may |
be split across calls to this handler. Note: Setting this handler to NULL |
may <em>NOT immediately</em> terminate call-backs if the parser is currently |
processing such a single block of contiguous markup-free text, as the parser |
will continue calling back until the end of the block is reached.</p> |
</div> |
<div class="handler"> |
<pre class="setter" id="XML_SetProcessingInstructionHandler"> |
void XMLCALL |
XML_SetProcessingInstructionHandler(XML_Parser p, |
XML_ProcessingInstructionHandler proc) |
</pre> |
<pre class="signature"> |
typedef void |
(XMLCALL *XML_ProcessingInstructionHandler)(void *userData, |
const XML_Char *target, |
const XML_Char *data); |
</pre> |
<p>Set a handler for processing instructions. The target is the first word |
in the processing instruction. The data is the rest of the characters in |
it after skipping all whitespace after the initial word.</p> |
</div> |
<div class="handler"> |
<pre class="setter" id="XML_SetCommentHandler"> |
void XMLCALL |
XML_SetCommentHandler(XML_Parser p, |
XML_CommentHandler cmnt) |
</pre> |
<pre class="signature"> |
typedef void |
(XMLCALL *XML_CommentHandler)(void *userData, |
const XML_Char *data); |
</pre> |
<p>Set a handler for comments. The data is all text inside the comment |
delimiters.</p> |
</div> |
<div class="handler"> |
<pre class="setter" id="XML_SetStartCdataSectionHandler"> |
void XMLCALL |
XML_SetStartCdataSectionHandler(XML_Parser p, |
XML_StartCdataSectionHandler start); |
</pre> |
<pre class="signature"> |
typedef void |
(XMLCALL *XML_StartCdataSectionHandler)(void *userData); |
</pre> |
<p>Set a handler that gets called at the beginning of a CDATA section.</p> |
</div> |
<div class="handler"> |
<pre class="setter" id="XML_SetEndCdataSectionHandler"> |
void XMLCALL |
XML_SetEndCdataSectionHandler(XML_Parser p, |
XML_EndCdataSectionHandler end); |
</pre> |
<pre class="signature"> |
typedef void |
(XMLCALL *XML_EndCdataSectionHandler)(void *userData); |
</pre> |
<p>Set a handler that gets called at the end of a CDATA section.</p> |
</div> |
<div class="handler"> |
<pre class="setter" id="XML_SetCdataSectionHandler"> |
void XMLCALL |
XML_SetCdataSectionHandler(XML_Parser p, |
XML_StartCdataSectionHandler start, |
XML_EndCdataSectionHandler end) |
</pre> |
<p>Sets both CDATA section handlers with one call.</p> |
</div> |
<div class="handler"> |
<pre class="setter" id="XML_SetDefaultHandler"> |
void XMLCALL |
XML_SetDefaultHandler(XML_Parser p, |
XML_DefaultHandler hndl) |
</pre> |
<pre class="signature"> |
typedef void |
(XMLCALL *XML_DefaultHandler)(void *userData, |
const XML_Char *s, |
int len); |
</pre> |
<p>Sets a handler for any characters in the document which wouldn't |
otherwise be handled. This includes both data for which no handlers |
can be set (like some kinds of DTD declarations) and data which could |
be reported but which currently has no handler set. The characters |
are passed exactly as they were present in the XML document except |
that they will be encoded in UTF-8 or UTF-16. Line boundaries are not |
normalized. Note that a byte order mark character is not passed to the |
default handler. There are no guarantees about how characters are |
divided between calls to the default handler: for example, a comment |
might be split between multiple calls. Setting the handler with |
this call has the side effect of turning off expansion of references |
to internally defined general entities. Instead these references are |
passed to the default handler.</p> |
<p>See also <code><a |
href="#XML_DefaultCurrent">XML_DefaultCurrent</a></code>.</p> |
</div> |
<div class="handler"> |
<pre class="setter" id="XML_SetDefaultHandlerExpand"> |
void XMLCALL |
XML_SetDefaultHandlerExpand(XML_Parser p, |
XML_DefaultHandler hndl) |
</pre> |
<pre class="signature"> |
typedef void |
(XMLCALL *XML_DefaultHandler)(void *userData, |
const XML_Char *s, |
int len); |
</pre> |
<p>This sets a default handler, but doesn't inhibit the expansion of |
internal entity references. The entity reference will not be passed |
to the default handler.</p> |
<p>See also <code><a |
href="#XML_DefaultCurrent">XML_DefaultCurrent</a></code>.</p> |
</div> |
<div class="handler"> |
<pre class="setter" id="XML_SetExternalEntityRefHandler"> |
void XMLCALL |
XML_SetExternalEntityRefHandler(XML_Parser p, |
XML_ExternalEntityRefHandler hndl) |
</pre> |
<pre class="signature"> |
typedef int |
(XMLCALL *XML_ExternalEntityRefHandler)(XML_Parser p, |
const XML_Char *context, |
const XML_Char *base, |
const XML_Char *systemId, |
const XML_Char *publicId); |
</pre> |
<p>Set an external entity reference handler. This handler is also |
called for processing an external DTD subset if parameter entity parsing |
is in effect. (See <a href="#XML_SetParamEntityParsing"> |
<code>XML_SetParamEntityParsing</code></a>.)</p> |
<p>The <code>context</code> parameter specifies the parsing context in |
the format expected by the <code>context</code> argument to <code><a |
href="#XML_ExternalEntityParserCreate" |
>XML_ExternalEntityParserCreate</a></code>. <code>code</code> is |
valid only until the handler returns, so if the referenced entity is |
to be parsed later, it must be copied. <code>context</code> is NULL |
only when the entity is a parameter entity, which is how one can |
differentiate between general and parameter entities.</p> |
<p>The <code>base</code> parameter is the base to use for relative |
system identifiers. It is set by <code><a |
href="#XML_SetBase">XML_SetBase</a></code> and may be NULL. The |
<code>publicId</code> parameter is the public id given in the entity |
declaration and may be NULL. <code>systemId</code> is the system |
identifier specified in the entity declaration and is never NULL.</p> |
<p>There are a couple of ways in which this handler differs from |
others. First, this handler returns a status indicator (an |
integer). <code>XML_STATUS_OK</code> should be returned for successful |
handling of the external entity reference. Returning |
<code>XML_STATUS_ERROR</code> indicates failure, and causes the |
calling parser to return an |
<code>XML_ERROR_EXTERNAL_ENTITY_HANDLING</code> error.</p> |
<p>Second, instead of having the user data as its first argument, it |
receives the parser that encountered the entity reference. This, along |
with the context parameter, may be used as arguments to a call to |
<code><a href= "#XML_ExternalEntityParserCreate" |
>XML_ExternalEntityParserCreate</a></code>. Using the returned |
parser, the body of the external entity can be recursively parsed.</p> |
<p>Since this handler may be called recursively, it should not be saving |
information into global or static variables.</p> |
</div> |
<pre class="fcndec" id="XML_SetExternalEntityRefHandlerArg"> |
void XMLCALL |
XML_SetExternalEntityRefHandlerArg(XML_Parser p, |
void *arg) |
</pre> |
<div class="fcndef"> |
<p>Set the argument passed to the ExternalEntityRefHandler. If |
<code>arg</code> is not NULL, it is the new value passed to the |
handler set using <code><a href="#XML_SetExternalEntityRefHandler" |
>XML_SetExternalEntityRefHandler</a></code>; if <code>arg</code> is |
NULL, the argument passed to the handler function will be the parser |
object itself.</p> |
<p><strong>Note:</strong> |
The type of <code>arg</code> and the type of the first argument to the |
ExternalEntityRefHandler do not match. This function takes a |
<code>void *</code> to be passed to the handler, while the handler |
accepts an <code>XML_Parser</code>. This is a historical accident, |
but will not be corrected before Expat 2.0 (at the earliest) to avoid |
causing compiler warnings for code that's known to work with this |
API. It is the responsibility of the application code to know the |
actual type of the argument passed to the handler and to manage it |
properly.</p> |
</div> |
<div class="handler"> |
<pre class="setter" id="XML_SetSkippedEntityHandler"> |
void XMLCALL |
XML_SetSkippedEntityHandler(XML_Parser p, |
XML_SkippedEntityHandler handler) |
</pre> |
<pre class="signature"> |
typedef void |
(XMLCALL *XML_SkippedEntityHandler)(void *userData, |
const XML_Char *entityName, |
int is_parameter_entity); |
</pre> |
<p>Set a skipped entity handler. This is called in two situations:</p> |
<ol> |
<li>An entity reference is encountered for which no declaration |
has been read <em>and</em> this is not an error.</li> |
<li>An internal entity reference is read, but not expanded, because |
<a href="#XML_SetDefaultHandler"><code>XML_SetDefaultHandler</code></a> |
has been called.</li> |
</ol> |
<p>The <code>is_parameter_entity</code> argument will be non-zero for |
a parameter entity and zero for a general entity.</p> <p>Note: skipped |
parameter entities in declarations and skipped general entities in |
attribute values cannot be reported, because the event would be out of |
sync with the reporting of the declarations or attribute values</p> |
</div> |
<div class="handler"> |
<pre class="setter" id="XML_SetUnknownEncodingHandler"> |
void XMLCALL |
XML_SetUnknownEncodingHandler(XML_Parser p, |
XML_UnknownEncodingHandler enchandler, |
void *encodingHandlerData) |
</pre> |
<pre class="signature"> |
typedef int |
(XMLCALL *XML_UnknownEncodingHandler)(void *encodingHandlerData, |
const XML_Char *name, |
XML_Encoding *info); |
typedef struct { |
int map[256]; |
void *data; |
int (XMLCALL *convert)(void *data, const char *s); |
void (XMLCALL *release)(void *data); |
} XML_Encoding; |
</pre> |
<p>Set a handler to deal with encodings other than the <a |
href="#builtin_encodings">built in set</a>. This should be done before |
<code><a href= "#XML_Parse" >XML_Parse</a></code> or <code><a href= |
"#XML_ParseBuffer" >XML_ParseBuffer</a></code> have been called on the |
given parser.</p> <p>If the handler knows how to deal with an encoding |
with the given name, it should fill in the <code>info</code> data |
structure and return <code>XML_STATUS_OK</code>. Otherwise it |
should return <code>XML_STATUS_ERROR</code>. The handler will be called |
at most once per parsed (external) entity. The optional application |
data pointer <code>encodingHandlerData</code> will be passed back to |
the handler.</p> |
<p>The map array contains information for every possible possible leading |
byte in a byte sequence. If the corresponding value is >= 0, then it's |
a single byte sequence and the byte encodes that Unicode value. If the |
value is -1, then that byte is invalid as the initial byte in a sequence. |
If the value is -n, where n is an integer > 1, then n is the number of |
bytes in the sequence and the actual conversion is accomplished by a |
call to the function pointed at by convert. This function may return -1 |
if the sequence itself is invalid. The convert pointer may be null if |
there are only single byte codes. The data parameter passed to the convert |
function is the data pointer from <code>XML_Encoding</code>. The |
string s is <em>NOT</em> nul-terminated and points at the sequence of |
bytes to be converted.</p> |
<p>The function pointed at by <code>release</code> is called by the |
parser when it is finished with the encoding. It may be NULL.</p> |
</div> |
<div class="handler"> |
<pre class="setter" id="XML_SetStartNamespaceDeclHandler"> |
void XMLCALL |
XML_SetStartNamespaceDeclHandler(XML_Parser p, |
XML_StartNamespaceDeclHandler start); |
</pre> |
<pre class="signature"> |
typedef void |
(XMLCALL *XML_StartNamespaceDeclHandler)(void *userData, |
const XML_Char *prefix, |
const XML_Char *uri); |
</pre> |
<p>Set a handler to be called when a namespace is declared. Namespace |
declarations occur inside start tags. But the namespace declaration start |
handler is called before the start tag handler for each namespace declared |
in that start tag.</p> |
</div> |
<div class="handler"> |
<pre class="setter" id="XML_SetEndNamespaceDeclHandler"> |
void XMLCALL |
XML_SetEndNamespaceDeclHandler(XML_Parser p, |
XML_EndNamespaceDeclHandler end); |
</pre> |
<pre class="signature"> |
typedef void |
(XMLCALL *XML_EndNamespaceDeclHandler)(void *userData, |
const XML_Char *prefix); |
</pre> |
<p>Set a handler to be called when leaving the scope of a namespace |
declaration. This will be called, for each namespace declaration, |
after the handler for the end tag of the element in which the |
namespace was declared.</p> |
</div> |
<div class="handler"> |
<pre class="setter" id="XML_SetNamespaceDeclHandler"> |
void XMLCALL |
XML_SetNamespaceDeclHandler(XML_Parser p, |
XML_StartNamespaceDeclHandler start, |
XML_EndNamespaceDeclHandler end) |
</pre> |
<p>Sets both namespace declaration handlers with a single call.</p> |
</div> |
<div class="handler"> |
<pre class="setter" id="XML_SetXmlDeclHandler"> |
void XMLCALL |
XML_SetXmlDeclHandler(XML_Parser p, |
XML_XmlDeclHandler xmldecl); |
</pre> |
<pre class="signature"> |
typedef void |
(XMLCALL *XML_XmlDeclHandler)(void *userData, |
const XML_Char *version, |
const XML_Char *encoding, |
int standalone); |
</pre> |
<p>Sets a handler that is called for XML declarations and also for |
text declarations discovered in external entities. The way to |
distinguish is that the <code>version</code> parameter will be NULL |
for text declarations. The <code>encoding</code> parameter may be NULL |
for an XML declaration. The <code>standalone</code> argument will |
contain -1, 0, or 1 indicating respectively that there was no |
standalone parameter in the declaration, that it was given as no, or |
that it was given as yes.</p> |
</div> |
<div class="handler"> |
<pre class="setter" id="XML_SetStartDoctypeDeclHandler"> |
void XMLCALL |
XML_SetStartDoctypeDeclHandler(XML_Parser p, |
XML_StartDoctypeDeclHandler start); |
</pre> |
<pre class="signature"> |
typedef void |
(XMLCALL *XML_StartDoctypeDeclHandler)(void *userData, |
const XML_Char *doctypeName, |
const XML_Char *sysid, |
const XML_Char *pubid, |
int has_internal_subset); |
</pre> |
<p>Set a handler that is called at the start of a DOCTYPE declaration, |
before any external or internal subset is parsed. Both <code>sysid</code> |
and <code>pubid</code> may be NULL. The <code>has_internal_subset</code> |
will be non-zero if the DOCTYPE declaration has an internal subset.</p> |
</div> |
<div class="handler"> |
<pre class="setter" id="XML_SetEndDoctypeDeclHandler"> |
void XMLCALL |
XML_SetEndDoctypeDeclHandler(XML_Parser p, |
XML_EndDoctypeDeclHandler end); |
</pre> |
<pre class="signature"> |
typedef void |
(XMLCALL *XML_EndDoctypeDeclHandler)(void *userData); |
</pre> |
<p>Set a handler that is called at the end of a DOCTYPE declaration, |
after parsing any external subset.</p> |
</div> |
<div class="handler"> |
<pre class="setter" id="XML_SetDoctypeDeclHandler"> |
void XMLCALL |
XML_SetDoctypeDeclHandler(XML_Parser p, |
XML_StartDoctypeDeclHandler start, |
XML_EndDoctypeDeclHandler end); |
</pre> |
<p>Set both doctype handlers with one call.</p> |
</div> |
<div class="handler"> |
<pre class="setter" id="XML_SetElementDeclHandler"> |
void XMLCALL |
XML_SetElementDeclHandler(XML_Parser p, |
XML_ElementDeclHandler eldecl); |
</pre> |
<pre class="signature"> |
typedef void |
(XMLCALL *XML_ElementDeclHandler)(void *userData, |
const XML_Char *name, |
XML_Content *model); |
</pre> |
<pre class="signature"> |
enum XML_Content_Type { |
XML_CTYPE_EMPTY = 1, |
XML_CTYPE_ANY, |
XML_CTYPE_MIXED, |
XML_CTYPE_NAME, |
XML_CTYPE_CHOICE, |
XML_CTYPE_SEQ |
}; |
enum XML_Content_Quant { |
XML_CQUANT_NONE, |
XML_CQUANT_OPT, |
XML_CQUANT_REP, |
XML_CQUANT_PLUS |
}; |
typedef struct XML_cp XML_Content; |
struct XML_cp { |
enum XML_Content_Type type; |
enum XML_Content_Quant quant; |
const XML_Char * name; |
unsigned int numchildren; |
XML_Content * children; |
}; |
</pre> |
<p>Sets a handler for element declarations in a DTD. The handler gets |
called with the name of the element in the declaration and a pointer |
to a structure that contains the element model. It is the |
application's responsibility to free this data structure using |
<code><a href="#XML_FreeContentModel" |
>XML_FreeContentModel</a></code>.</p> |
<p>The <code>model</code> argument is the root of a tree of |
<code>XML_Content</code> nodes. If <code>type</code> equals |
<code>XML_CTYPE_EMPTY</code> or <code>XML_CTYPE_ANY</code>, then |
<code>quant</code> will be <code>XML_CQUANT_NONE</code>, and the other |
fields will be zero or NULL. If <code>type</code> is |
<code>XML_CTYPE_MIXED</code>, then <code>quant</code> will be |
<code>XML_CQUANT_NONE</code> or <code>XML_CQUANT_REP</code> and |
<code>numchildren</code> will contain the number of elements that are |
allowed to be mixed in and <code>children</code> points to an array of |
<code>XML_Content</code> structures that will all have type |
XML_CTYPE_NAME with no quantification. Only the root node can be type |
<code>XML_CTYPE_EMPTY</code>, <code>XML_CTYPE_ANY</code>, or |
<code>XML_CTYPE_MIXED</code>.</p> |
<p>For type <code>XML_CTYPE_NAME</code>, the <code>name</code> field |
points to the name and the <code>numchildren</code> and |
<code>children</code> fields will be zero and NULL. The |
<code>quant</code> field will indicate any quantifiers placed on the |
name.</p> |
<p>Types <code>XML_CTYPE_CHOICE</code> and <code>XML_CTYPE_SEQ</code> |
indicate a choice or sequence respectively. The |
<code>numchildren</code> field indicates how many nodes in the choice |
or sequence and <code>children</code> points to the nodes.</p> |
</div> |
<div class="handler"> |
<pre class="setter" id="XML_SetAttlistDeclHandler"> |
void XMLCALL |
XML_SetAttlistDeclHandler(XML_Parser p, |
XML_AttlistDeclHandler attdecl); |
</pre> |
<pre class="signature"> |
typedef void |
(XMLCALL *XML_AttlistDeclHandler)(void *userData, |
const XML_Char *elname, |
const XML_Char *attname, |
const XML_Char *att_type, |
const XML_Char *dflt, |
int isrequired); |
</pre> |
<p>Set a handler for attlist declarations in the DTD. This handler is |
called for <em>each</em> attribute. So a single attlist declaration |
with multiple attributes declared will generate multiple calls to this |
handler. The <code>elname</code> parameter returns the name of the |
element for which the attribute is being declared. The attribute name |
is in the <code>attname</code> parameter. The attribute type is in the |
<code>att_type</code> parameter. It is the string representing the |
type in the declaration with whitespace removed.</p> |
<p>The <code>dflt</code> parameter holds the default value. It will be |
NULL in the case of "#IMPLIED" or "#REQUIRED" attributes. You can |
distinguish these two cases by checking the <code>isrequired</code> |
parameter, which will be true in the case of "#REQUIRED" attributes. |
Attributes which are "#FIXED" will have also have a true |
<code>isrequired</code>, but they will have the non-NULL fixed value |
in the <code>dflt</code> parameter.</p> |
</div> |
<div class="handler"> |
<pre class="setter" id="XML_SetEntityDeclHandler"> |
void XMLCALL |
XML_SetEntityDeclHandler(XML_Parser p, |
XML_EntityDeclHandler handler); |
</pre> |
<pre class="signature"> |
typedef void |
(XMLCALL *XML_EntityDeclHandler)(void *userData, |
const XML_Char *entityName, |
int is_parameter_entity, |
const XML_Char *value, |
int value_length, |
const XML_Char *base, |
const XML_Char *systemId, |
const XML_Char *publicId, |
const XML_Char *notationName); |
</pre> |
<p>Sets a handler that will be called for all entity declarations. |
The <code>is_parameter_entity</code> argument will be non-zero in the |
case of parameter entities and zero otherwise.</p> |
<p>For internal entities (<code><!ENTITY foo "bar"></code>), |
<code>value</code> will be non-NULL and <code>systemId</code>, |
<code>publicId</code>, and <code>notationName</code> will all be NULL. |
The value string is <em>not</em> NULL terminated; the length is |
provided in the <code>value_length</code> parameter. Do not use |
<code>value_length</code> to test for internal entities, since it is |
legal to have zero-length values. Instead check for whether or not |
<code>value</code> is NULL.</p> <p>The <code>notationName</code> |
argument will have a non-NULL value only for unparsed entity |
declarations.</p> |
</div> |
<div class="handler"> |
<pre class="setter" id="XML_SetUnparsedEntityDeclHandler"> |
void XMLCALL |
XML_SetUnparsedEntityDeclHandler(XML_Parser p, |
XML_UnparsedEntityDeclHandler h) |
</pre> |
<pre class="signature"> |
typedef void |
(XMLCALL *XML_UnparsedEntityDeclHandler)(void *userData, |
const XML_Char *entityName, |
const XML_Char *base, |
const XML_Char *systemId, |
const XML_Char *publicId, |
const XML_Char *notationName); |
</pre> |
<p>Set a handler that receives declarations of unparsed entities. These |
are entity declarations that have a notation (NDATA) field:</p> |
<div id="eg"><pre> |
<!ENTITY logo SYSTEM "images/logo.gif" NDATA gif> |
</pre></div> |
<p>This handler is obsolete and is provided for backwards |
compatibility. Use instead <a href= "#XML_SetEntityDeclHandler" |
>XML_SetEntityDeclHandler</a>.</p> |
</div> |
<div class="handler"> |
<pre class="setter" id="XML_SetNotationDeclHandler"> |
void XMLCALL |
XML_SetNotationDeclHandler(XML_Parser p, |
XML_NotationDeclHandler h) |
</pre> |
<pre class="signature"> |
typedef void |
(XMLCALL *XML_NotationDeclHandler)(void *userData, |
const XML_Char *notationName, |
const XML_Char *base, |
const XML_Char *systemId, |
const XML_Char *publicId); |
</pre> |
<p>Set a handler that receives notation declarations.</p> |
</div> |
<div class="handler"> |
<pre class="setter" id="XML_SetNotStandaloneHandler"> |
void XMLCALL |
XML_SetNotStandaloneHandler(XML_Parser p, |
XML_NotStandaloneHandler h) |
</pre> |
<pre class="signature"> |
typedef int |
(XMLCALL *XML_NotStandaloneHandler)(void *userData); |
</pre> |
<p>Set a handler that is called if the document is not "standalone". |
This happens when there is an external subset or a reference to a |
parameter entity, but does not have standalone set to "yes" in an XML |
declaration. If this handler returns <code>XML_STATUS_ERROR</code>, |
then the parser will throw an <code>XML_ERROR_NOT_STANDALONE</code> |
error.</p> |
</div> |
<h3><a name="position">Parse position and error reporting functions</a></h3> |
<p>These are the functions you'll want to call when the parse |
functions return <code>XML_STATUS_ERROR</code> (a parse error has |
occurred), although the position reporting functions are useful outside |
of errors. The position reported is the byte position (in the original |
document or entity encoding) of the first of the sequence of |
characters that generated the current event (or the error that caused |
the parse functions to return <code>XML_STATUS_ERROR</code>.) The |
exceptions are callbacks trigged by declarations in the document |
prologue, in which case they exact position reported is somewhere in the |
relevant markup, but not necessarily as meaningful as for other |
events.</p> |
<p>The position reporting functions are accurate only outside of the |
DTD. In other words, they usually return bogus information when |
called from within a DTD declaration handler.</p> |
<pre class="fcndec" id="XML_GetErrorCode"> |
enum XML_Error XMLCALL |
XML_GetErrorCode(XML_Parser p); |
</pre> |
<div class="fcndef"> |
Return what type of error has occurred. |
</div> |
<pre class="fcndec" id="XML_ErrorString"> |
const XML_LChar * XMLCALL |
XML_ErrorString(enum XML_Error code); |
</pre> |
<div class="fcndef"> |
Return a string describing the error corresponding to code. |
The code should be one of the enums that can be returned from |
<code><a href= "#XML_GetErrorCode" >XML_GetErrorCode</a></code>. |
</div> |
<pre class="fcndec" id="XML_GetCurrentByteIndex"> |
XML_Index XMLCALL |
XML_GetCurrentByteIndex(XML_Parser p); |
</pre> |
<div class="fcndef"> |
Return the byte offset of the position. This always corresponds to |
the values returned by <code><a href= "#XML_GetCurrentLineNumber" |
>XML_GetCurrentLineNumber</a></code> and <code><a href= |
"#XML_GetCurrentColumnNumber" >XML_GetCurrentColumnNumber</a></code>. |
</div> |
<pre class="fcndec" id="XML_GetCurrentLineNumber"> |
XML_Size XMLCALL |
XML_GetCurrentLineNumber(XML_Parser p); |
</pre> |
<div class="fcndef"> |
Return the line number of the position. The first line is reported as |
<code>1</code>. |
</div> |
<pre class="fcndec" id="XML_GetCurrentColumnNumber"> |
XML_Size XMLCALL |
XML_GetCurrentColumnNumber(XML_Parser p); |
</pre> |
<div class="fcndef"> |
Return the offset, from the beginning of the current line, of |
the position. |
</div> |
<pre class="fcndec" id="XML_GetCurrentByteCount"> |
int XMLCALL |
XML_GetCurrentByteCount(XML_Parser p); |
</pre> |
<div class="fcndef"> |
Return the number of bytes in the current event. Returns |
<code>0</code> if the event is inside a reference to an internal |
entity and for the end-tag event for empty element tags (the later can |
be used to distinguish empty-element tags from empty elements using |
separate start and end tags). |
</div> |
<pre class="fcndec" id="XML_GetInputContext"> |
const char * XMLCALL |
XML_GetInputContext(XML_Parser p, |
int *offset, |
int *size); |
</pre> |
<div class="fcndef"> |
<p>Returns the parser's input buffer, sets the integer pointed at by |
<code>offset</code> to the offset within this buffer of the current |
parse position, and set the integer pointed at by <code>size</code> to |
the size of the returned buffer.</p> |
<p>This should only be called from within a handler during an active |
parse and the returned buffer should only be referred to from within |
the handler that made the call. This input buffer contains the |
untranslated bytes of the input.</p> |
<p>Only a limited amount of context is kept, so if the event |
triggering a call spans over a very large amount of input, the actual |
parse position may be before the beginning of the buffer.</p> |
<p>If <code>XML_CONTEXT_BYTES</code> is not defined, this will always |
return NULL.</p> |
</div> |
<h3><a name="miscellaneous">Miscellaneous functions</a></h3> |
<p>The functions in this section either obtain state information from |
the parser or can be used to dynamicly set parser options.</p> |
<pre class="fcndec" id="XML_SetUserData"> |
void XMLCALL |
XML_SetUserData(XML_Parser p, |
void *userData); |
</pre> |
<div class="fcndef"> |
This sets the user data pointer that gets passed to handlers. It |
overwrites any previous value for this pointer. Note that the |
application is responsible for freeing the memory associated with |
<code>userData</code> when it is finished with the parser. So if you |
call this when there's already a pointer there, and you haven't freed |
the memory associated with it, then you've probably just leaked |
memory. |
</div> |
<pre class="fcndec" id="XML_GetUserData"> |
void * XMLCALL |
XML_GetUserData(XML_Parser p); |
</pre> |
<div class="fcndef"> |
This returns the user data pointer that gets passed to handlers. |
It is actually implemented as a macro. |
</div> |
<pre class="fcndec" id="XML_UseParserAsHandlerArg"> |
void XMLCALL |
XML_UseParserAsHandlerArg(XML_Parser p); |
</pre> |
<div class="fcndef"> |
After this is called, handlers receive the parser in their |
<code>userData</code> arguments. The user data can still be obtained |
using the <code><a href= "#XML_GetUserData" |
>XML_GetUserData</a></code> function. |
</div> |
<pre class="fcndec" id="XML_SetBase"> |
enum XML_Status XMLCALL |
XML_SetBase(XML_Parser p, |
const XML_Char *base); |
</pre> |
<div class="fcndef"> |
Set the base to be used for resolving relative URIs in system |
identifiers. The return value is <code>XML_STATUS_ERROR</code> if |
there's no memory to store base, otherwise it's |
<code>XML_STATUS_OK</code>. |
</div> |
<pre class="fcndec" id="XML_GetBase"> |
const XML_Char * XMLCALL |
XML_GetBase(XML_Parser p); |
</pre> |
<div class="fcndef"> |
Return the base for resolving relative URIs. |
</div> |
<pre class="fcndec" id="XML_GetSpecifiedAttributeCount"> |
int XMLCALL |
XML_GetSpecifiedAttributeCount(XML_Parser p); |
</pre> |
<div class="fcndef"> |
When attributes are reported to the start handler in the atts vector, |
attributes that were explicitly set in the element occur before any |
attributes that receive their value from default information in an |
ATTLIST declaration. This function returns the number of attributes |
that were explicitly set times two, thus giving the offset in the |
<code>atts</code> array passed to the start tag handler of the first |
attribute set due to defaults. It supplies information for the last |
call to a start handler. If called inside a start handler, then that |
means the current call. |
</div> |
<pre class="fcndec" id="XML_GetIdAttributeIndex"> |
int XMLCALL |
XML_GetIdAttributeIndex(XML_Parser p); |
</pre> |
<div class="fcndef"> |
Returns the index of the ID attribute passed in the atts array in the |
last call to <code><a href= "#XML_StartElementHandler" |
>XML_StartElementHandler</a></code>, or -1 if there is no ID |
attribute. If called inside a start handler, then that means the |
current call. |
</div> |
<pre class="fcndec" id="XML_GetAttributeInfo"> |
const XML_AttrInfo * XMLCALL |
XML_GetAttributeInfo(XML_Parser parser); |
</pre> |
<pre class="signature"> |
typedef struct { |
XML_Index nameStart; /* Offset to beginning of the attribute name. */ |
XML_Index nameEnd; /* Offset after the attribute name's last byte. */ |
XML_Index valueStart; /* Offset to beginning of the attribute value. */ |
XML_Index valueEnd; /* Offset after the attribute value's last byte. */ |
} XML_AttrInfo; |
</pre> |
<div class="fcndef"> |
Returns an array of <code>XML_AttrInfo</code> structures for the |
attribute/value pairs passed in the last call to the |
<code>XML_StartElementHandler</code> that were specified |
in the start-tag rather than defaulted. Each attribute/value pair counts |
as 1; thus the number of entries in the array is |
<code>XML_GetSpecifiedAttributeCount(parser) / 2</code>. |
</div> |
<pre class="fcndec" id="XML_SetEncoding"> |
enum XML_Status XMLCALL |
XML_SetEncoding(XML_Parser p, |
const XML_Char *encoding); |
</pre> |
<div class="fcndef"> |
Set the encoding to be used by the parser. It is equivalent to |
passing a non-null encoding argument to the parser creation functions. |
It must not be called after <code><a href= "#XML_Parse" |
>XML_Parse</a></code> or <code><a href= "#XML_ParseBuffer" |
>XML_ParseBuffer</a></code> have been called on the given parser. |
Returns <code>XML_STATUS_OK</code> on success or |
<code>XML_STATUS_ERROR</code> on error. |
</div> |
<pre class="fcndec" id="XML_SetParamEntityParsing"> |
int XMLCALL |
XML_SetParamEntityParsing(XML_Parser p, |
enum XML_ParamEntityParsing code); |
</pre> |
<div class="fcndef"> |
This enables parsing of parameter entities, including the external |
parameter entity that is the external DTD subset, according to |
<code>code</code>. |
The choices for <code>code</code> are: |
<ul> |
<li><code>XML_PARAM_ENTITY_PARSING_NEVER</code></li> |
<li><code>XML_PARAM_ENTITY_PARSING_UNLESS_STANDALONE</code></li> |
<li><code>XML_PARAM_ENTITY_PARSING_ALWAYS</code></li> |
</ul> |
<b>Note:</b> If <code>XML_SetParamEntityParsing</code> is called after |
<code>XML_Parse</code> or <code>XML_ParseBuffer</code>, then it has |
no effect and will always return 0. |
</div> |
<pre class="fcndec" id="XML_SetHashSalt"> |
int XMLCALL |
XML_SetHashSalt(XML_Parser p, |
unsigned long hash_salt); |
</pre> |
<div class="fcndef"> |
Sets the hash salt to use for internal hash calculations. |
Helps in preventing DoS attacks based on predicting hash |
function behavior. In order to have an effect this must be called |
before parsing has started. Returns 1 if successful, 0 when called |
after <code>XML_Parse</code> or <code>XML_ParseBuffer</code>. |
<p><b>Note:</b> This call is optional, as the parser will auto-generate a new |
random salt value if no value has been set at the start of parsing.</p> |
</div> |
<pre class="fcndec" id="XML_UseForeignDTD"> |
enum XML_Error XMLCALL |
XML_UseForeignDTD(XML_Parser parser, XML_Bool useDTD); |
</pre> |
<div class="fcndef"> |
<p>This function allows an application to provide an external subset |
for the document type declaration for documents which do not specify |
an external subset of their own. For documents which specify an |
external subset in their DOCTYPE declaration, the application-provided |
subset will be ignored. If the document does not contain a DOCTYPE |
declaration at all and <code>useDTD</code> is true, the |
application-provided subset will be parsed, but the |
<code>startDoctypeDeclHandler</code> and |
<code>endDoctypeDeclHandler</code> functions, if set, will not be |
called. The setting of parameter entity parsing, controlled using |
<code><a href= "#XML_SetParamEntityParsing" |
>XML_SetParamEntityParsing</a></code>, will be honored.</p> |
<p>The application-provided external subset is read by calling the |
external entity reference handler set via <code><a href= |
"#XML_SetExternalEntityRefHandler" |
>XML_SetExternalEntityRefHandler</a></code> with both |
<code>publicId</code> and <code>systemId</code> set to NULL.</p> |
<p>If this function is called after parsing has begun, it returns |
<code>XML_ERROR_CANT_CHANGE_FEATURE_ONCE_PARSING</code> and ignores |
<code>useDTD</code>. If called when Expat has been compiled without |
DTD support, it returns |
<code>XML_ERROR_FEATURE_REQUIRES_XML_DTD</code>. Otherwise, it |
returns <code>XML_ERROR_NONE</code>.</p> |
<p><b>Note:</b> For the purpose of checking WFC: Entity Declared, passing |
<code>useDTD == XML_TRUE</code> will make the parser behave as if |
the document had a DTD with an external subset. This holds true even if |
the external entity reference handler returns without action.</p> |
</div> |
<pre class="fcndec" id="XML_SetReturnNSTriplet"> |
void XMLCALL |
XML_SetReturnNSTriplet(XML_Parser parser, |
int do_nst); |
</pre> |
<div class="fcndef"> |
<p> |
This function only has an effect when using a parser created with |
<code><a href= "#XML_ParserCreateNS" >XML_ParserCreateNS</a></code>, |
i.e. when namespace processing is in effect. The <code>do_nst</code> |
sets whether or not prefixes are returned with names qualified with a |
namespace prefix. If this function is called with <code>do_nst</code> |
non-zero, then afterwards namespace qualified names (that is qualified |
with a prefix as opposed to belonging to a default namespace) are |
returned as a triplet with the three parts separated by the namespace |
separator specified when the parser was created. The order of |
returned parts is URI, local name, and prefix.</p> <p>If |
<code>do_nst</code> is zero, then namespaces are reported in the |
default manner, URI then local_name separated by the namespace |
separator.</p> |
</div> |
<pre class="fcndec" id="XML_DefaultCurrent"> |
void XMLCALL |
XML_DefaultCurrent(XML_Parser parser); |
</pre> |
<div class="fcndef"> |
This can be called within a handler for a start element, end element, |
processing instruction or character data. It causes the corresponding |
markup to be passed to the default handler set by <code><a |
href="#XML_SetDefaultHandler" >XML_SetDefaultHandler</a></code> or |
<code><a href="#XML_SetDefaultHandlerExpand" |
>XML_SetDefaultHandlerExpand</a></code>. It does nothing if there is |
not a default handler. |
</div> |
<pre class="fcndec" id="XML_ExpatVersion"> |
XML_LChar * XMLCALL |
XML_ExpatVersion(); |
</pre> |
<div class="fcndef"> |
Return the library version as a string (e.g. <code>"expat_1.95.1"</code>). |
</div> |
<pre class="fcndec" id="XML_ExpatVersionInfo"> |
struct XML_Expat_Version XMLCALL |
XML_ExpatVersionInfo(); |
</pre> |
<pre class="signature"> |
typedef struct { |
int major; |
int minor; |
int micro; |
} XML_Expat_Version; |
</pre> |
<div class="fcndef"> |
Return the library version information as a structure. |
Some macros are also defined that support compile-time tests of the |
library version: |
<ul> |
<li><code>XML_MAJOR_VERSION</code></li> |
<li><code>XML_MINOR_VERSION</code></li> |
<li><code>XML_MICRO_VERSION</code></li> |
</ul> |
Testing these constants is currently the best way to determine if |
particular parts of the Expat API are available. |
</div> |
<pre class="fcndec" id="XML_GetFeatureList"> |
const XML_Feature * XMLCALL |
XML_GetFeatureList(); |
</pre> |
<pre class="signature"> |
enum XML_FeatureEnum { |
XML_FEATURE_END = 0, |
XML_FEATURE_UNICODE, |
XML_FEATURE_UNICODE_WCHAR_T, |
XML_FEATURE_DTD, |
XML_FEATURE_CONTEXT_BYTES, |
XML_FEATURE_MIN_SIZE, |
XML_FEATURE_SIZEOF_XML_CHAR, |
XML_FEATURE_SIZEOF_XML_LCHAR, |
XML_FEATURE_NS, |
XML_FEATURE_LARGE_SIZE |
}; |
typedef struct { |
enum XML_FeatureEnum feature; |
XML_LChar *name; |
long int value; |
} XML_Feature; |
</pre> |
<div class="fcndef"> |
<p>Returns a list of "feature" records, providing details on how |
Expat was configured at compile time. Most applications should not |
need to worry about this, but this information is otherwise not |
available from Expat. This function allows code that does need to |
check these features to do so at runtime.</p> |
<p>The return value is an array of <code>XML_Feature</code>, |
terminated by a record with a <code>feature</code> of |
<code>XML_FEATURE_END</code> and <code>name</code> of NULL, |
identifying the feature-test macros Expat was compiled with. Since an |
application that requires this kind of information needs to determine |
the type of character the <code>name</code> points to, records for the |
<code>XML_FEATURE_SIZEOF_XML_CHAR</code> and |
<code>XML_FEATURE_SIZEOF_XML_LCHAR</code> will be located at the |
beginning of the list, followed by <code>XML_FEATURE_UNICODE</code> |
and <code>XML_FEATURE_UNICODE_WCHAR_T</code>, if they are present at |
all.</p> |
<p>Some features have an associated value. If there isn't an |
associated value, the <code>value</code> field is set to 0. At this |
time, the following features have been defined to have values:</p> |
<dl> |
<dt><code>XML_FEATURE_SIZEOF_XML_CHAR</code></dt> |
<dd>The number of bytes occupied by one <code>XML_Char</code> |
character.</dd> |
<dt><code>XML_FEATURE_SIZEOF_XML_LCHAR</code></dt> |
<dd>The number of bytes occupied by one <code>XML_LChar</code> |
character.</dd> |
<dt><code>XML_FEATURE_CONTEXT_BYTES</code></dt> |
<dd>The maximum number of characters of context which can be |
reported by <code><a href= "#XML_GetInputContext" |
>XML_GetInputContext</a></code>.</dd> |
</dl> |
</div> |
<pre class="fcndec" id="XML_FreeContentModel"> |
void XMLCALL |
XML_FreeContentModel(XML_Parser parser, XML_Content *model); |
</pre> |
<div class="fcndef"> |
Function to deallocate the <code>model</code> argument passed to the |
<code>XML_ElementDeclHandler</code> callback set using <code><a |
href="#XML_SetElementDeclHandler" >XML_ElementDeclHandler</a></code>. |
This function should not be used for any other purpose. |
</div> |
<p>The following functions allow external code to share the memory |
allocator an <code>XML_Parser</code> has been configured to use. This |
is especially useful for third-party libraries that interact with a |
parser object created by application code, or heavily layered |
applications. This can be essential when using dynamically loaded |
libraries which use different C standard libraries (this can happen on |
Windows, at least).</p> |
<pre class="fcndec" id="XML_MemMalloc"> |
void * XMLCALL |
XML_MemMalloc(XML_Parser parser, size_t size); |
</pre> |
<div class="fcndef"> |
Allocate <code>size</code> bytes of memory using the allocator the |
<code>parser</code> object has been configured to use. Returns a |
pointer to the memory or NULL on failure. Memory allocated in this |
way must be freed using <code><a href="#XML_MemFree" |
>XML_MemFree</a></code>. |
</div> |
<pre class="fcndec" id="XML_MemRealloc"> |
void * XMLCALL |
XML_MemRealloc(XML_Parser parser, void *ptr, size_t size); |
</pre> |
<div class="fcndef"> |
Allocate <code>size</code> bytes of memory using the allocator the |
<code>parser</code> object has been configured to use. |
<code>ptr</code> must point to a block of memory allocated by <code><a |
href="#XML_MemMalloc" >XML_MemMalloc</a></code> or |
<code>XML_MemRealloc</code>, or be NULL. This function tries to |
expand the block pointed to by <code>ptr</code> if possible. Returns |
a pointer to the memory or NULL on failure. On success, the original |
block has either been expanded or freed. On failure, the original |
block has not been freed; the caller is responsible for freeing the |
original block. Memory allocated in this way must be freed using |
<code><a href="#XML_MemFree" |
>XML_MemFree</a></code>. |
</div> |
<pre class="fcndec" id="XML_MemFree"> |
void XMLCALL |
XML_MemFree(XML_Parser parser, void *ptr); |
</pre> |
<div class="fcndef"> |
Free a block of memory pointed to by <code>ptr</code>. The block must |
have been allocated by <code><a href="#XML_MemMalloc" |
>XML_MemMalloc</a></code> or <code>XML_MemRealloc</code>, or be NULL. |
</div> |
<hr /> |
<p><a href="http://validator.w3.org/check/referer"><img |
src="valid-xhtml10.png" alt="Valid XHTML 1.0!" |
height="31" width="88" class="noborder" /></a></p> |
</div> |
</body> |
</html> |
/contrib/sdk/sources/expat/doc/style.css |
---|
0,0 → 1,101 |
body { |
background-color: white; |
border: 0px; |
margin: 0px; |
padding: 0px; |
} |
.corner { |
width: 200px; |
height: 80px; |
text-align: center; |
} |
.banner { |
background-color: rgb(110,139,61); |
color: rgb(255,236,176); |
padding-left: 2em; |
} |
.banner h1 { |
font-size: 200%; |
} |
.content { |
padding: 0em 2em 1em 2em; |
} |
.releaseno { |
background-color: rgb(110,139,61); |
color: rgb(255,236,176); |
padding-bottom: 0.3em; |
padding-top: 0.5em; |
text-align: center; |
font-weight: bold; |
} |
.noborder { |
border-width: 0px; |
} |
.eg { |
padding-left: 1em; |
padding-top: .5em; |
padding-bottom: .5em; |
border: solid thin; |
margin: 1em 0; |
background-color: tan; |
margin-left: 2em; |
margin-right: 10%; |
} |
.pseudocode { |
padding-left: 1em; |
padding-top: .5em; |
padding-bottom: .5em; |
border: solid thin; |
margin: 1em 0; |
background-color: rgb(250,220,180); |
margin-left: 2em; |
margin-right: 10%; |
} |
.handler { |
width: 100%; |
border-top-width: thin; |
margin-bottom: 1em; |
} |
.handler p { |
margin-left: 2em; |
} |
.setter { |
font-weight: bold; |
} |
.signature { |
color: navy; |
} |
.fcndec { |
width: 100%; |
border-top-width: thin; |
font-weight: bold; |
} |
.fcndef { |
margin-left: 2em; |
margin-bottom: 2em; |
} |
dd { |
margin-bottom: 2em; |
} |
.cpp-symbols dt { |
font-family: monospace; |
} |
.cpp-symbols dd { |
margin-bottom: 1em; |
} |
/contrib/sdk/sources/expat/doc/valid-xhtml10.png |
---|
Cannot display: file marked as a binary type. |
svn:mime-type = application/octet-stream |
Property changes: |
Added: svn:mime-type |
+application/octet-stream |
\ No newline at end of property |
/contrib/sdk/sources/expat/doc/xmlwf.sgml |
---|
0,0 → 1,468 |
<!doctype refentry PUBLIC "-//OASIS//DTD DocBook V4.1//EN" [ |
<!-- Process this file with docbook-to-man to generate an nroff manual |
page: `docbook-to-man manpage.sgml > manpage.1'. You may view |
the manual page with: `docbook-to-man manpage.sgml | nroff -man | |
less'. A typical entry in a Makefile or Makefile.am is: |
manpage.1: manpage.sgml |
docbook-to-man $< > $@ |
--> |
<!-- Fill in your name for FIRSTNAME and SURNAME. --> |
<!ENTITY dhfirstname "<firstname>Scott</firstname>"> |
<!ENTITY dhsurname "<surname>Bronson</surname>"> |
<!-- Please adjust the date whenever revising the manpage. --> |
<!ENTITY dhdate "<date>December 5, 2001</date>"> |
<!-- SECTION should be 1-8, maybe w/ subsection other parameters are |
allowed: see man(7), man(1). --> |
<!ENTITY dhsection "<manvolnum>1</manvolnum>"> |
<!ENTITY dhemail "<email>bronson@rinspin.com</email>"> |
<!ENTITY dhusername "Scott Bronson"> |
<!ENTITY dhucpackage "<refentrytitle>XMLWF</refentrytitle>"> |
<!ENTITY dhpackage "xmlwf"> |
<!ENTITY debian "<productname>Debian GNU/Linux</productname>"> |
<!ENTITY gnu "<acronym>GNU</acronym>"> |
]> |
<refentry> |
<refentryinfo> |
<address> |
&dhemail; |
</address> |
<author> |
&dhfirstname; |
&dhsurname; |
</author> |
<copyright> |
<year>2001</year> |
<holder>&dhusername;</holder> |
</copyright> |
&dhdate; |
</refentryinfo> |
<refmeta> |
&dhucpackage; |
&dhsection; |
</refmeta> |
<refnamediv> |
<refname>&dhpackage;</refname> |
<refpurpose>Determines if an XML document is well-formed</refpurpose> |
</refnamediv> |
<refsynopsisdiv> |
<cmdsynopsis> |
<command>&dhpackage;</command> |
<arg><option>-s</option></arg> |
<arg><option>-n</option></arg> |
<arg><option>-p</option></arg> |
<arg><option>-x</option></arg> |
<arg><option>-e <replaceable>encoding</replaceable></option></arg> |
<arg><option>-w</option></arg> |
<arg><option>-d <replaceable>output-dir</replaceable></option></arg> |
<arg><option>-c</option></arg> |
<arg><option>-m</option></arg> |
<arg><option>-r</option></arg> |
<arg><option>-t</option></arg> |
<arg><option>-v</option></arg> |
<arg>file ...</arg> |
</cmdsynopsis> |
</refsynopsisdiv> |
<refsect1> |
<title>DESCRIPTION</title> |
<para> |
<command>&dhpackage;</command> uses the Expat library to |
determine if an XML document is well-formed. It is |
non-validating. |
</para> |
<para> |
If you do not specify any files on the command-line, and you |
have a recent version of <command>&dhpackage;</command>, the |
input file will be read from standard input. |
</para> |
</refsect1> |
<refsect1> |
<title>WELL-FORMED DOCUMENTS</title> |
<para> |
A well-formed document must adhere to the |
following rules: |
</para> |
<itemizedlist> |
<listitem><para> |
The file begins with an XML declaration. For instance, |
<literal><?xml version="1.0" standalone="yes"?></literal>. |
<emphasis>NOTE:</emphasis> |
<command>&dhpackage;</command> does not currently |
check for a valid XML declaration. |
</para></listitem> |
<listitem><para> |
Every start tag is either empty (<tag/>) |
or has a corresponding end tag. |
</para></listitem> |
<listitem><para> |
There is exactly one root element. This element must contain |
all other elements in the document. Only comments, white |
space, and processing instructions may come after the close |
of the root element. |
</para></listitem> |
<listitem><para> |
All elements nest properly. |
</para></listitem> |
<listitem><para> |
All attribute values are enclosed in quotes (either single |
or double). |
</para></listitem> |
</itemizedlist> |
<para> |
If the document has a DTD, and it strictly complies with that |
DTD, then the document is also considered <emphasis>valid</emphasis>. |
<command>&dhpackage;</command> is a non-validating parser -- |
it does not check the DTD. However, it does support |
external entities (see the <option>-x</option> option). |
</para> |
</refsect1> |
<refsect1> |
<title>OPTIONS</title> |
<para> |
When an option includes an argument, you may specify the argument either |
separately ("<option>-d</option> output") or concatenated with the |
option ("<option>-d</option>output"). <command>&dhpackage;</command> |
supports both. |
</para> |
<variablelist> |
<varlistentry> |
<term><option>-c</option></term> |
<listitem> |
<para> |
If the input file is well-formed and <command>&dhpackage;</command> |
doesn't encounter any errors, the input file is simply copied to |
the output directory unchanged. |
This implies no namespaces (turns off <option>-n</option>) and |
requires <option>-d</option> to specify an output file. |
</para> |
</listitem> |
</varlistentry> |
<varlistentry> |
<term><option>-d output-dir</option></term> |
<listitem> |
<para> |
Specifies a directory to contain transformed |
representations of the input files. |
By default, <option>-d</option> outputs a canonical representation |
(described below). |
You can select different output formats using <option>-c</option> |
and <option>-m</option>. |
</para> |
<para> |
The output filenames will |
be exactly the same as the input filenames or "STDIN" if the input is |
coming from standard input. Therefore, you must be careful that the |
output file does not go into the same directory as the input |
file. Otherwise, <command>&dhpackage;</command> will delete the |
input file before it generates the output file (just like running |
<literal>cat < file > file</literal> in most shells). |
</para> |
<para> |
Two structurally equivalent XML documents have a byte-for-byte |
identical canonical XML representation. |
Note that ignorable white space is considered significant and |
is treated equivalently to data. |
More on canonical XML can be found at |
http://www.jclark.com/xml/canonxml.html . |
</para> |
</listitem> |
</varlistentry> |
<varlistentry> |
<term><option>-e encoding</option></term> |
<listitem> |
<para> |
Specifies the character encoding for the document, overriding |
any document encoding declaration. <command>&dhpackage;</command> |
supports four built-in encodings: |
<literal>US-ASCII</literal>, |
<literal>UTF-8</literal>, |
<literal>UTF-16</literal>, and |
<literal>ISO-8859-1</literal>. |
Also see the <option>-w</option> option. |
</para> |
</listitem> |
</varlistentry> |
<varlistentry> |
<term><option>-m</option></term> |
<listitem> |
<para> |
Outputs some strange sort of XML file that completely |
describes the the input file, including character postitions. |
Requires <option>-d</option> to specify an output file. |
</para> |
</listitem> |
</varlistentry> |
<varlistentry> |
<term><option>-n</option></term> |
<listitem> |
<para> |
Turns on namespace processing. (describe namespaces) |
<option>-c</option> disables namespaces. |
</para> |
</listitem> |
</varlistentry> |
<varlistentry> |
<term><option>-p</option></term> |
<listitem> |
<para> |
Tells xmlwf to process external DTDs and parameter |
entities. |
</para> |
<para> |
Normally <command>&dhpackage;</command> never parses parameter |
entities. <option>-p</option> tells it to always parse them. |
<option>-p</option> implies <option>-x</option>. |
</para> |
</listitem> |
</varlistentry> |
<varlistentry> |
<term><option>-r</option></term> |
<listitem> |
<para> |
Normally <command>&dhpackage;</command> memory-maps the XML file |
before parsing; this can result in faster parsing on many |
platforms. |
<option>-r</option> turns off memory-mapping and uses normal file |
IO calls instead. |
Of course, memory-mapping is automatically turned off |
when reading from standard input. |
</para> |
<para> |
Use of memory-mapping can cause some platforms to report |
substantially higher memory usage for |
<command>&dhpackage;</command>, but this appears to be a matter of |
the operating system reporting memory in a strange way; there is |
not a leak in <command>&dhpackage;</command>. |
</para> |
</listitem> |
</varlistentry> |
<varlistentry> |
<term><option>-s</option></term> |
<listitem> |
<para> |
Prints an error if the document is not standalone. |
A document is standalone if it has no external subset and no |
references to parameter entities. |
</para> |
</listitem> |
</varlistentry> |
<varlistentry> |
<term><option>-t</option></term> |
<listitem> |
<para> |
Turns on timings. This tells Expat to parse the entire file, |
but not perform any processing. |
This gives a fairly accurate idea of the raw speed of Expat itself |
without client overhead. |
<option>-t</option> turns off most of the output options |
(<option>-d</option>, <option>-m</option>, <option>-c</option>, |
...). |
</para> |
</listitem> |
</varlistentry> |
<varlistentry> |
<term><option>-v</option></term> |
<listitem> |
<para> |
Prints the version of the Expat library being used, including some |
information on the compile-time configuration of the library, and |
then exits. |
</para> |
</listitem> |
</varlistentry> |
<varlistentry> |
<term><option>-w</option></term> |
<listitem> |
<para> |
Enables support for Windows code pages. |
Normally, <command>&dhpackage;</command> will throw an error if it |
runs across an encoding that it is not equipped to handle itself. With |
<option>-w</option>, &dhpackage; will try to use a Windows code |
page. See also <option>-e</option>. |
</para> |
</listitem> |
</varlistentry> |
<varlistentry> |
<term><option>-x</option></term> |
<listitem> |
<para> |
Turns on parsing external entities. |
</para> |
<para> |
Non-validating parsers are not required to resolve external |
entities, or even expand entities at all. |
Expat always expands internal entities (?), |
but external entity parsing must be enabled explicitly. |
</para> |
<para> |
External entities are simply entities that obtain their |
data from outside the XML file currently being parsed. |
</para> |
<para> |
This is an example of an internal entity: |
<literallayout> |
<!ENTITY vers '1.0.2'> |
</literallayout> |
</para> |
<para> |
And here are some examples of external entities: |
<literallayout> |
<!ENTITY header SYSTEM "header-&vers;.xml"> (parsed) |
<!ENTITY logo SYSTEM "logo.png" PNG> (unparsed) |
</literallayout> |
</para> |
</listitem> |
</varlistentry> |
<varlistentry> |
<term><option>--</option></term> |
<listitem> |
<para> |
(Two hyphens.) |
Terminates the list of options. This is only needed if a filename |
starts with a hyphen. For example: |
</para> |
<literallayout> |
&dhpackage; -- -myfile.xml |
</literallayout> |
<para> |
will run <command>&dhpackage;</command> on the file |
<filename>-myfile.xml</filename>. |
</para> |
</listitem> |
</varlistentry> |
</variablelist> |
<para> |
Older versions of <command>&dhpackage;</command> do not support |
reading from standard input. |
</para> |
</refsect1> |
<refsect1> |
<title>OUTPUT</title> |
<para> |
If an input file is not well-formed, |
<command>&dhpackage;</command> prints a single line describing |
the problem to standard output. If a file is well formed, |
<command>&dhpackage;</command> outputs nothing. |
Note that the result code is <emphasis>not</emphasis> set. |
</para> |
</refsect1> |
<refsect1> |
<title>BUGS</title> |
<para> |
<command>&dhpackage;</command> returns a 0 - noerr result, |
even if the file is not well-formed. There is no good way for |
a program to use <command>&dhpackage;</command> to quickly |
check a file -- it must parse <command>&dhpackage;</command>'s |
standard output. |
</para> |
<para> |
The errors should go to standard error, not standard output. |
</para> |
<para> |
There should be a way to get <option>-d</option> to send its |
output to standard output rather than forcing the user to send |
it to a file. |
</para> |
<para> |
I have no idea why anyone would want to use the |
<option>-d</option>, <option>-c</option>, and |
<option>-m</option> options. If someone could explain it to |
me, I'd like to add this information to this manpage. |
</para> |
</refsect1> |
<refsect1> |
<title>ALTERNATIVES</title> |
<para> |
Here are some XML validators on the web: |
<literallayout> |
http://www.hcrc.ed.ac.uk/~richard/xml-check.html |
http://www.stg.brown.edu/service/xmlvalid/ |
http://www.scripting.com/frontier5/xml/code/xmlValidator.html |
http://www.xml.com/pub/a/tools/ruwf/check.html |
</literallayout> |
</para> |
</refsect1> |
<refsect1> |
<title>SEE ALSO</title> |
<para> |
<literallayout> |
The Expat home page: http://www.libexpat.org/ |
The W3 XML specification: http://www.w3.org/TR/REC-xml |
</literallayout> |
</para> |
</refsect1> |
<refsect1> |
<title>AUTHOR</title> |
<para> |
This manual page was written by &dhusername; &dhemail; for |
the &debian; system (but may be used by others). Permission is |
granted to copy, distribute and/or modify this document under |
the terms of the <acronym>GNU</acronym> Free Documentation |
License, Version 1.1. |
</para> |
</refsect1> |
</refentry> |
<!-- Keep this comment at the end of the file |
Local variables: |
mode: sgml |
sgml-omittag:t |
sgml-shorttag:t |
sgml-minimize-attributes:nil |
sgml-always-quote-attributes:t |
sgml-indent-step:2 |
sgml-indent-data:t |
sgml-parent-document:nil |
sgml-default-dtd-file:nil |
sgml-exposed-tags:nil |
sgml-local-catalogs:nil |
sgml-local-ecat-files:nil |
End: |
--> |
/contrib/sdk/sources/expat/examples/elements.c |
---|
0,0 → 1,65 |
/* This is simple demonstration of how to use expat. This program |
reads an XML document from standard input and writes a line with |
the name of each element to standard output indenting child |
elements by one tab stop more than their parent element. |
It must be used with Expat compiled for UTF-8 output. |
*/ |
#include <stdio.h> |
#include "expat.h" |
#if defined(__amigaos__) && defined(__USE_INLINE__) |
#include <proto/expat.h> |
#endif |
#ifdef XML_LARGE_SIZE |
#if defined(XML_USE_MSC_EXTENSIONS) && _MSC_VER < 1400 |
#define XML_FMT_INT_MOD "I64" |
#else |
#define XML_FMT_INT_MOD "ll" |
#endif |
#else |
#define XML_FMT_INT_MOD "l" |
#endif |
static void XMLCALL |
startElement(void *userData, const char *name, const char **atts) |
{ |
int i; |
int *depthPtr = (int *)userData; |
for (i = 0; i < *depthPtr; i++) |
putchar('\t'); |
puts(name); |
*depthPtr += 1; |
} |
static void XMLCALL |
endElement(void *userData, const char *name) |
{ |
int *depthPtr = (int *)userData; |
*depthPtr -= 1; |
} |
int |
main(int argc, char *argv[]) |
{ |
char buf[BUFSIZ]; |
XML_Parser parser = XML_ParserCreate(NULL); |
int done; |
int depth = 0; |
XML_SetUserData(parser, &depth); |
XML_SetElementHandler(parser, startElement, endElement); |
do { |
int len = (int)fread(buf, 1, sizeof(buf), stdin); |
done = len < sizeof(buf); |
if (XML_Parse(parser, buf, len, done) == XML_STATUS_ERROR) { |
fprintf(stderr, |
"%s at line %" XML_FMT_INT_MOD "u\n", |
XML_ErrorString(XML_GetErrorCode(parser)), |
XML_GetCurrentLineNumber(parser)); |
return 1; |
} |
} while (!done); |
XML_ParserFree(parser); |
return 0; |
} |
/contrib/sdk/sources/expat/examples/elements.dsp |
---|
0,0 → 1,103 |
# Microsoft Developer Studio Project File - Name="elements" - Package Owner=<4> |
# Microsoft Developer Studio Generated Build File, Format Version 6.00 |
# ** DO NOT EDIT ** |
# TARGTYPE "Win32 (x86) Console Application" 0x0103 |
CFG=elements - Win32 Debug |
!MESSAGE This is not a valid makefile. To build this project using NMAKE, |
!MESSAGE use the Export Makefile command and run |
!MESSAGE |
!MESSAGE NMAKE /f "elements.mak". |
!MESSAGE |
!MESSAGE You can specify a configuration when running NMAKE |
!MESSAGE by defining the macro CFG on the command line. For example: |
!MESSAGE |
!MESSAGE NMAKE /f "elements.mak" CFG="elements - Win32 Debug" |
!MESSAGE |
!MESSAGE Possible choices for configuration are: |
!MESSAGE |
!MESSAGE "elements - Win32 Release" (based on "Win32 (x86) Console Application") |
!MESSAGE "elements - Win32 Debug" (based on "Win32 (x86) Console Application") |
!MESSAGE |
# Begin Project |
# PROP AllowPerConfigDependencies 0 |
# PROP Scc_ProjName "" |
# PROP Scc_LocalPath "" |
CPP=cl.exe |
RSC=rc.exe |
!IF "$(CFG)" == "elements - Win32 Release" |
# PROP BASE Use_MFC 0 |
# PROP BASE Use_Debug_Libraries 0 |
# PROP BASE Output_Dir "Release" |
# PROP BASE Intermediate_Dir "Release" |
# PROP BASE Target_Dir "" |
# PROP Use_MFC 0 |
# PROP Use_Debug_Libraries 0 |
# PROP Output_Dir "..\win32\bin\Release" |
# PROP Intermediate_Dir "..\win32\tmp\Release-elements" |
# PROP Ignore_Export_Lib 0 |
# PROP Target_Dir "" |
# ADD BASE CPP /nologo /W3 /GX /O2 /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D "_MBCS" /YX /FD /c |
# ADD CPP /nologo /MT /W3 /GX /O2 /I "..\lib" /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D "_MBCS" /D "XML_STATIC" /FD /c |
# SUBTRACT CPP /X /YX |
# ADD BASE RSC /l 0x409 /d "NDEBUG" |
# ADD RSC /l 0x409 /d "NDEBUG" |
BSC32=bscmake.exe |
# ADD BASE BSC32 /nologo |
# ADD BSC32 /nologo |
LINK32=link.exe |
# ADD BASE LINK32 /nologo /subsystem:console /machine:I386 |
# ADD LINK32 libexpatMT.lib /nologo /subsystem:console /pdb:none /machine:I386 /libpath:"..\win32\bin\Release" /out:"..\win32\bin\Release\elements.exe" |
!ELSEIF "$(CFG)" == "elements - Win32 Debug" |
# PROP BASE Use_MFC 0 |
# PROP BASE Use_Debug_Libraries 1 |
# PROP BASE Output_Dir "Debug" |
# PROP BASE Intermediate_Dir "Debug" |
# PROP BASE Target_Dir "" |
# PROP Use_MFC 0 |
# PROP Use_Debug_Libraries 1 |
# PROP Output_Dir "..\win32\bin\Debug" |
# PROP Intermediate_Dir "..\win32\tmp\Debug-elements" |
# PROP Ignore_Export_Lib 0 |
# PROP Target_Dir "" |
# ADD BASE CPP /nologo /W3 /Gm /GX /ZI /Od /D "WIN32" /D "_DEBUG" /D "_CONSOLE" /D "_MBCS" /YX /FD /GZ /c |
# ADD CPP /nologo /MTd /W3 /GX /ZI /Od /I "..\lib" /D "WIN32" /D "_DEBUG" /D "_CONSOLE" /D "_MBCS" /D "XML_STATIC" /FR /FD /GZ /c |
# ADD BASE RSC /l 0x409 /d "_DEBUG" |
# ADD RSC /l 0x409 /d "_DEBUG" |
BSC32=bscmake.exe |
# ADD BASE BSC32 /nologo |
# ADD BSC32 /nologo |
LINK32=link.exe |
# ADD BASE LINK32 /nologo /subsystem:console /debug /machine:I386 /pdbtype:sept |
# ADD LINK32 libexpatMT.lib /nologo /subsystem:console /pdb:none /debug /machine:I386 /libpath:"..\win32\bin\Debug" /out:"..\win32\bin\Debug\elements.exe" |
!ENDIF |
# Begin Target |
# Name "elements - Win32 Release" |
# Name "elements - Win32 Debug" |
# Begin Group "Source Files" |
# PROP Default_Filter "cpp;c;cxx;rc;def;r;odl;idl;hpj;bat" |
# Begin Source File |
SOURCE=.\elements.c |
# End Source File |
# End Group |
# Begin Group "Header Files" |
# PROP Default_Filter "h;hpp;hxx;hm;inl" |
# End Group |
# Begin Group "Resource Files" |
# PROP Default_Filter "ico;cur;bmp;dlg;rc2;rct;bin;rgs;gif;jpg;jpeg;jpe" |
# End Group |
# End Target |
# End Project |
/contrib/sdk/sources/expat/examples/outline.c |
---|
0,0 → 1,106 |
/***************************************************************** |
* outline.c |
* |
* Copyright 1999, Clark Cooper |
* All rights reserved. |
* |
* This program is free software; you can redistribute it and/or |
* modify it under the terms of the license contained in the |
* COPYING file that comes with the expat distribution. |
* |
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, |
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF |
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. |
* IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY |
* CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, |
* TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE |
* SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. |
* |
* Read an XML document from standard input and print an element |
* outline on standard output. |
* Must be used with Expat compiled for UTF-8 output. |
*/ |
#include <stdio.h> |
#include <expat.h> |
#if defined(__amigaos__) && defined(__USE_INLINE__) |
#include <proto/expat.h> |
#endif |
#ifdef XML_LARGE_SIZE |
#if defined(XML_USE_MSC_EXTENSIONS) && _MSC_VER < 1400 |
#define XML_FMT_INT_MOD "I64" |
#else |
#define XML_FMT_INT_MOD "ll" |
#endif |
#else |
#define XML_FMT_INT_MOD "l" |
#endif |
#define BUFFSIZE 8192 |
char Buff[BUFFSIZE]; |
int Depth; |
static void XMLCALL |
start(void *data, const char *el, const char **attr) |
{ |
int i; |
for (i = 0; i < Depth; i++) |
printf(" "); |
printf("%s", el); |
for (i = 0; attr[i]; i += 2) { |
printf(" %s='%s'", attr[i], attr[i + 1]); |
} |
printf("\n"); |
Depth++; |
} |
static void XMLCALL |
end(void *data, const char *el) |
{ |
Depth--; |
} |
int |
main(int argc, char *argv[]) |
{ |
XML_Parser p = XML_ParserCreate(NULL); |
if (! p) { |
fprintf(stderr, "Couldn't allocate memory for parser\n"); |
exit(-1); |
} |
XML_SetElementHandler(p, start, end); |
for (;;) { |
int done; |
int len; |
len = (int)fread(Buff, 1, BUFFSIZE, stdin); |
if (ferror(stdin)) { |
fprintf(stderr, "Read error\n"); |
exit(-1); |
} |
done = feof(stdin); |
if (XML_Parse(p, Buff, len, done) == XML_STATUS_ERROR) { |
fprintf(stderr, "Parse error at line %" XML_FMT_INT_MOD "u:\n%s\n", |
XML_GetCurrentLineNumber(p), |
XML_ErrorString(XML_GetErrorCode(p))); |
exit(-1); |
} |
if (done) |
break; |
} |
XML_ParserFree(p); |
return 0; |
} |
/contrib/sdk/sources/expat/examples/outline.dsp |
---|
0,0 → 1,103 |
# Microsoft Developer Studio Project File - Name="outline" - Package Owner=<4> |
# Microsoft Developer Studio Generated Build File, Format Version 6.00 |
# ** DO NOT EDIT ** |
# TARGTYPE "Win32 (x86) Console Application" 0x0103 |
CFG=outline - Win32 Debug |
!MESSAGE This is not a valid makefile. To build this project using NMAKE, |
!MESSAGE use the Export Makefile command and run |
!MESSAGE |
!MESSAGE NMAKE /f "outline.mak". |
!MESSAGE |
!MESSAGE You can specify a configuration when running NMAKE |
!MESSAGE by defining the macro CFG on the command line. For example: |
!MESSAGE |
!MESSAGE NMAKE /f "outline.mak" CFG="outline - Win32 Debug" |
!MESSAGE |
!MESSAGE Possible choices for configuration are: |
!MESSAGE |
!MESSAGE "outline - Win32 Release" (based on "Win32 (x86) Console Application") |
!MESSAGE "outline - Win32 Debug" (based on "Win32 (x86) Console Application") |
!MESSAGE |
# Begin Project |
# PROP AllowPerConfigDependencies 0 |
# PROP Scc_ProjName "" |
# PROP Scc_LocalPath "" |
CPP=cl.exe |
RSC=rc.exe |
!IF "$(CFG)" == "outline - Win32 Release" |
# PROP BASE Use_MFC 0 |
# PROP BASE Use_Debug_Libraries 0 |
# PROP BASE Output_Dir "Release" |
# PROP BASE Intermediate_Dir "Release" |
# PROP BASE Target_Dir "" |
# PROP Use_MFC 0 |
# PROP Use_Debug_Libraries 0 |
# PROP Output_Dir "..\win32\bin\Release" |
# PROP Intermediate_Dir "..\win32\tmp\Release-outline" |
# PROP Ignore_Export_Lib 0 |
# PROP Target_Dir "" |
# ADD BASE CPP /nologo /W3 /GX /O2 /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D "_MBCS" /YX /FD /c |
# ADD CPP /nologo /MT /W3 /GX /O2 /I "..\lib" /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D "_MBCS" /FD /c |
# SUBTRACT CPP /X /YX |
# ADD BASE RSC /l 0x409 /d "NDEBUG" |
# ADD RSC /l 0x409 /d "NDEBUG" |
BSC32=bscmake.exe |
# ADD BASE BSC32 /nologo |
# ADD BSC32 /nologo |
LINK32=link.exe |
# ADD BASE LINK32 /nologo /subsystem:console /machine:I386 |
# ADD LINK32 libexpat.lib /nologo /subsystem:console /pdb:none /machine:I386 /libpath:"..\win32\bin\Release" /out:"..\win32\bin\Release\outline.exe" |
!ELSEIF "$(CFG)" == "outline - Win32 Debug" |
# PROP BASE Use_MFC 0 |
# PROP BASE Use_Debug_Libraries 1 |
# PROP BASE Output_Dir "Debug" |
# PROP BASE Intermediate_Dir "Debug" |
# PROP BASE Target_Dir "" |
# PROP Use_MFC 0 |
# PROP Use_Debug_Libraries 1 |
# PROP Output_Dir "..\win32\bin\Debug" |
# PROP Intermediate_Dir "..\win32\tmp\Debug-outline" |
# PROP Ignore_Export_Lib 0 |
# PROP Target_Dir "" |
# ADD BASE CPP /nologo /W3 /Gm /GX /ZI /Od /D "WIN32" /D "_DEBUG" /D "_CONSOLE" /D "_MBCS" /YX /FD /GZ /c |
# ADD CPP /nologo /MTd /W3 /Gm /GX /ZI /Od /I "..\lib" /D "WIN32" /D "_DEBUG" /D "_CONSOLE" /D "_MBCS" /FD /GZ /c |
# ADD BASE RSC /l 0x409 /d "_DEBUG" |
# ADD RSC /l 0x409 /d "_DEBUG" |
BSC32=bscmake.exe |
# ADD BASE BSC32 /nologo |
# ADD BSC32 /nologo |
LINK32=link.exe |
# ADD BASE LINK32 /nologo /subsystem:console /debug /machine:I386 /pdbtype:sept |
# ADD LINK32 libexpat.lib /nologo /subsystem:console /pdb:none /debug /machine:I386 /libpath:"..\win32\bin\Debug" /out:"..\win32\bin\Debug\outline.exe" |
!ENDIF |
# Begin Target |
# Name "outline - Win32 Release" |
# Name "outline - Win32 Debug" |
# Begin Group "Source Files" |
# PROP Default_Filter "cpp;c;cxx;rc;def;r;odl;idl;hpj;bat" |
# Begin Source File |
SOURCE=.\outline.c |
# End Source File |
# End Group |
# Begin Group "Header Files" |
# PROP Default_Filter "h;hpp;hxx;hm;inl" |
# End Group |
# Begin Group "Resource Files" |
# PROP Default_Filter "ico;cur;bmp;dlg;rc2;rct;bin;rgs;gif;jpg;jpeg;jpe" |
# End Group |
# End Target |
# End Project |
/contrib/sdk/sources/expat/expat_config.h |
---|
0,0 → 1,103 |
/* expat_config.h. Generated from expat_config.h.in by configure. */ |
/* expat_config.h.in. Generated from configure.in by autoheader. */ |
/* 1234 = LIL_ENDIAN, 4321 = BIGENDIAN */ |
#define BYTEORDER 1234 |
/* Define to 1 if you have the `bcopy' function. */ |
/* #undef HAVE_BCOPY */ |
/* Define to 1 if you have the <dlfcn.h> header file. */ |
/* #undef HAVE_DLFCN_H */ |
/* Define to 1 if you have the <fcntl.h> header file. */ |
#define HAVE_FCNTL_H 1 |
/* Define to 1 if you have the `getpagesize' function. */ |
#define HAVE_GETPAGESIZE 1 |
/* Define to 1 if you have the <inttypes.h> header file. */ |
#define HAVE_INTTYPES_H 1 |
/* Define to 1 if you have the `memmove' function. */ |
#define HAVE_MEMMOVE 1 |
/* Define to 1 if you have the <memory.h> header file. */ |
#define HAVE_MEMORY_H 1 |
/* Define to 1 if you have a working `mmap' system call. */ |
/* #undef HAVE_MMAP */ |
/* Define to 1 if you have the <stdint.h> header file. */ |
#define HAVE_STDINT_H 1 |
/* Define to 1 if you have the <stdlib.h> header file. */ |
#define HAVE_STDLIB_H 1 |
/* Define to 1 if you have the <strings.h> header file. */ |
#define HAVE_STRINGS_H 1 |
/* Define to 1 if you have the <string.h> header file. */ |
#define HAVE_STRING_H 1 |
/* Define to 1 if you have the <sys/param.h> header file. */ |
#define HAVE_SYS_PARAM_H 1 |
/* Define to 1 if you have the <sys/stat.h> header file. */ |
#define HAVE_SYS_STAT_H 1 |
/* Define to 1 if you have the <sys/types.h> header file. */ |
#define HAVE_SYS_TYPES_H 1 |
/* Define to 1 if you have the <unistd.h> header file. */ |
#define HAVE_UNISTD_H 1 |
/* Define to the sub-directory in which libtool stores uninstalled libraries. |
*/ |
#define LT_OBJDIR ".libs/" |
/* Define to the address where bug reports for this package should be sent. */ |
#define PACKAGE_BUGREPORT "expat-bugs@libexpat.org" |
/* Define to the full name of this package. */ |
#define PACKAGE_NAME "expat" |
/* Define to the full name and version of this package. */ |
#define PACKAGE_STRING "expat 2.1.0" |
/* Define to the one symbol short name of this package. */ |
#define PACKAGE_TARNAME "expat" |
/* Define to the home page for this package. */ |
#define PACKAGE_URL "" |
/* Define to the version of this package. */ |
#define PACKAGE_VERSION "2.1.0" |
/* Define to 1 if you have the ANSI C header files. */ |
#define STDC_HEADERS 1 |
/* whether byteorder is bigendian */ |
/* #undef WORDS_BIGENDIAN */ |
/* Define to specify how much context to retain around the current parse |
point. */ |
#define XML_CONTEXT_BYTES 1024 |
/* Define to make parameter entity parsing functionality available. */ |
#define XML_DTD 1 |
/* Define to make XML Namespaces functionality available. */ |
#define XML_NS 1 |
/* Define to __FUNCTION__ or "" if `__func__' does not conform to ANSI C. */ |
/* #undef __func__ */ |
/* Define to empty if `const' does not conform to ANSI C. */ |
/* #undef const */ |
/* Define to `long int' if <sys/types.h> does not define. */ |
/* #undef off_t */ |
/* Define to `unsigned int' if <sys/types.h> does not define. */ |
/* #undef size_t */ |
/contrib/sdk/sources/expat/lib/amigaconfig.h |
---|
0,0 → 1,32 |
#ifndef AMIGACONFIG_H |
#define AMIGACONFIG_H |
/* 1234 = LIL_ENDIAN, 4321 = BIGENDIAN */ |
#define BYTEORDER 4321 |
/* Define to 1 if you have the `bcopy' function. */ |
#define HAVE_BCOPY 1 |
/* Define to 1 if you have the <check.h> header file. */ |
#undef HAVE_CHECK_H |
/* Define to 1 if you have the `memmove' function. */ |
#define HAVE_MEMMOVE 1 |
/* Define to 1 if you have the <unistd.h> header file. */ |
#define HAVE_UNISTD_H 1 |
/* whether byteorder is bigendian */ |
#define WORDS_BIGENDIAN |
/* Define to specify how much context to retain around the current parse |
point. */ |
#define XML_CONTEXT_BYTES 1024 |
/* Define to make parameter entity parsing functionality available. */ |
#define XML_DTD |
/* Define to make XML Namespaces functionality available. */ |
#define XML_NS |
#endif /* AMIGACONFIG_H */ |
/contrib/sdk/sources/expat/lib/ascii.h |
---|
0,0 → 1,92 |
/* Copyright (c) 1998, 1999 Thai Open Source Software Center Ltd |
See the file COPYING for copying permission. |
*/ |
#define ASCII_A 0x41 |
#define ASCII_B 0x42 |
#define ASCII_C 0x43 |
#define ASCII_D 0x44 |
#define ASCII_E 0x45 |
#define ASCII_F 0x46 |
#define ASCII_G 0x47 |
#define ASCII_H 0x48 |
#define ASCII_I 0x49 |
#define ASCII_J 0x4A |
#define ASCII_K 0x4B |
#define ASCII_L 0x4C |
#define ASCII_M 0x4D |
#define ASCII_N 0x4E |
#define ASCII_O 0x4F |
#define ASCII_P 0x50 |
#define ASCII_Q 0x51 |
#define ASCII_R 0x52 |
#define ASCII_S 0x53 |
#define ASCII_T 0x54 |
#define ASCII_U 0x55 |
#define ASCII_V 0x56 |
#define ASCII_W 0x57 |
#define ASCII_X 0x58 |
#define ASCII_Y 0x59 |
#define ASCII_Z 0x5A |
#define ASCII_a 0x61 |
#define ASCII_b 0x62 |
#define ASCII_c 0x63 |
#define ASCII_d 0x64 |
#define ASCII_e 0x65 |
#define ASCII_f 0x66 |
#define ASCII_g 0x67 |
#define ASCII_h 0x68 |
#define ASCII_i 0x69 |
#define ASCII_j 0x6A |
#define ASCII_k 0x6B |
#define ASCII_l 0x6C |
#define ASCII_m 0x6D |
#define ASCII_n 0x6E |
#define ASCII_o 0x6F |
#define ASCII_p 0x70 |
#define ASCII_q 0x71 |
#define ASCII_r 0x72 |
#define ASCII_s 0x73 |
#define ASCII_t 0x74 |
#define ASCII_u 0x75 |
#define ASCII_v 0x76 |
#define ASCII_w 0x77 |
#define ASCII_x 0x78 |
#define ASCII_y 0x79 |
#define ASCII_z 0x7A |
#define ASCII_0 0x30 |
#define ASCII_1 0x31 |
#define ASCII_2 0x32 |
#define ASCII_3 0x33 |
#define ASCII_4 0x34 |
#define ASCII_5 0x35 |
#define ASCII_6 0x36 |
#define ASCII_7 0x37 |
#define ASCII_8 0x38 |
#define ASCII_9 0x39 |
#define ASCII_TAB 0x09 |
#define ASCII_SPACE 0x20 |
#define ASCII_EXCL 0x21 |
#define ASCII_QUOT 0x22 |
#define ASCII_AMP 0x26 |
#define ASCII_APOS 0x27 |
#define ASCII_MINUS 0x2D |
#define ASCII_PERIOD 0x2E |
#define ASCII_COLON 0x3A |
#define ASCII_SEMI 0x3B |
#define ASCII_LT 0x3C |
#define ASCII_EQUALS 0x3D |
#define ASCII_GT 0x3E |
#define ASCII_LSQB 0x5B |
#define ASCII_RSQB 0x5D |
#define ASCII_UNDERSCORE 0x5F |
#define ASCII_LPAREN 0x28 |
#define ASCII_RPAREN 0x29 |
#define ASCII_FF 0x0C |
#define ASCII_SLASH 0x2F |
#define ASCII_HASH 0x23 |
#define ASCII_PIPE 0x7C |
#define ASCII_COMMA 0x2C |
/contrib/sdk/sources/expat/lib/asciitab.h |
---|
0,0 → 1,36 |
/* Copyright (c) 1998, 1999 Thai Open Source Software Center Ltd |
See the file COPYING for copying permission. |
*/ |
/* 0x00 */ BT_NONXML, BT_NONXML, BT_NONXML, BT_NONXML, |
/* 0x04 */ BT_NONXML, BT_NONXML, BT_NONXML, BT_NONXML, |
/* 0x08 */ BT_NONXML, BT_S, BT_LF, BT_NONXML, |
/* 0x0C */ BT_NONXML, BT_CR, BT_NONXML, BT_NONXML, |
/* 0x10 */ BT_NONXML, BT_NONXML, BT_NONXML, BT_NONXML, |
/* 0x14 */ BT_NONXML, BT_NONXML, BT_NONXML, BT_NONXML, |
/* 0x18 */ BT_NONXML, BT_NONXML, BT_NONXML, BT_NONXML, |
/* 0x1C */ BT_NONXML, BT_NONXML, BT_NONXML, BT_NONXML, |
/* 0x20 */ BT_S, BT_EXCL, BT_QUOT, BT_NUM, |
/* 0x24 */ BT_OTHER, BT_PERCNT, BT_AMP, BT_APOS, |
/* 0x28 */ BT_LPAR, BT_RPAR, BT_AST, BT_PLUS, |
/* 0x2C */ BT_COMMA, BT_MINUS, BT_NAME, BT_SOL, |
/* 0x30 */ BT_DIGIT, BT_DIGIT, BT_DIGIT, BT_DIGIT, |
/* 0x34 */ BT_DIGIT, BT_DIGIT, BT_DIGIT, BT_DIGIT, |
/* 0x38 */ BT_DIGIT, BT_DIGIT, BT_COLON, BT_SEMI, |
/* 0x3C */ BT_LT, BT_EQUALS, BT_GT, BT_QUEST, |
/* 0x40 */ BT_OTHER, BT_HEX, BT_HEX, BT_HEX, |
/* 0x44 */ BT_HEX, BT_HEX, BT_HEX, BT_NMSTRT, |
/* 0x48 */ BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, |
/* 0x4C */ BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, |
/* 0x50 */ BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, |
/* 0x54 */ BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, |
/* 0x58 */ BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, BT_LSQB, |
/* 0x5C */ BT_OTHER, BT_RSQB, BT_OTHER, BT_NMSTRT, |
/* 0x60 */ BT_OTHER, BT_HEX, BT_HEX, BT_HEX, |
/* 0x64 */ BT_HEX, BT_HEX, BT_HEX, BT_NMSTRT, |
/* 0x68 */ BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, |
/* 0x6C */ BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, |
/* 0x70 */ BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, |
/* 0x74 */ BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, |
/* 0x78 */ BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, BT_OTHER, |
/* 0x7C */ BT_VERBAR, BT_OTHER, BT_OTHER, BT_OTHER, |
/contrib/sdk/sources/expat/lib/expat.h |
---|
0,0 → 1,1047 |
/* Copyright (c) 1998, 1999, 2000 Thai Open Source Software Center Ltd |
See the file COPYING for copying permission. |
*/ |
#ifndef Expat_INCLUDED |
#define Expat_INCLUDED 1 |
#ifdef __VMS |
/* 0 1 2 3 0 1 2 3 |
1234567890123456789012345678901 1234567890123456789012345678901 */ |
#define XML_SetProcessingInstructionHandler XML_SetProcessingInstrHandler |
#define XML_SetUnparsedEntityDeclHandler XML_SetUnparsedEntDeclHandler |
#define XML_SetStartNamespaceDeclHandler XML_SetStartNamespcDeclHandler |
#define XML_SetExternalEntityRefHandlerArg XML_SetExternalEntRefHandlerArg |
#endif |
#include <stdlib.h> |
#include "expat_external.h" |
#ifdef __cplusplus |
extern "C" { |
#endif |
struct XML_ParserStruct; |
typedef struct XML_ParserStruct *XML_Parser; |
/* Should this be defined using stdbool.h when C99 is available? */ |
typedef unsigned char XML_Bool; |
#define XML_TRUE ((XML_Bool) 1) |
#define XML_FALSE ((XML_Bool) 0) |
/* The XML_Status enum gives the possible return values for several |
API functions. The preprocessor #defines are included so this |
stanza can be added to code that still needs to support older |
versions of Expat 1.95.x: |
#ifndef XML_STATUS_OK |
#define XML_STATUS_OK 1 |
#define XML_STATUS_ERROR 0 |
#endif |
Otherwise, the #define hackery is quite ugly and would have been |
dropped. |
*/ |
enum XML_Status { |
XML_STATUS_ERROR = 0, |
#define XML_STATUS_ERROR XML_STATUS_ERROR |
XML_STATUS_OK = 1, |
#define XML_STATUS_OK XML_STATUS_OK |
XML_STATUS_SUSPENDED = 2 |
#define XML_STATUS_SUSPENDED XML_STATUS_SUSPENDED |
}; |
enum XML_Error { |
XML_ERROR_NONE, |
XML_ERROR_NO_MEMORY, |
XML_ERROR_SYNTAX, |
XML_ERROR_NO_ELEMENTS, |
XML_ERROR_INVALID_TOKEN, |
XML_ERROR_UNCLOSED_TOKEN, |
XML_ERROR_PARTIAL_CHAR, |
XML_ERROR_TAG_MISMATCH, |
XML_ERROR_DUPLICATE_ATTRIBUTE, |
XML_ERROR_JUNK_AFTER_DOC_ELEMENT, |
XML_ERROR_PARAM_ENTITY_REF, |
XML_ERROR_UNDEFINED_ENTITY, |
XML_ERROR_RECURSIVE_ENTITY_REF, |
XML_ERROR_ASYNC_ENTITY, |
XML_ERROR_BAD_CHAR_REF, |
XML_ERROR_BINARY_ENTITY_REF, |
XML_ERROR_ATTRIBUTE_EXTERNAL_ENTITY_REF, |
XML_ERROR_MISPLACED_XML_PI, |
XML_ERROR_UNKNOWN_ENCODING, |
XML_ERROR_INCORRECT_ENCODING, |
XML_ERROR_UNCLOSED_CDATA_SECTION, |
XML_ERROR_EXTERNAL_ENTITY_HANDLING, |
XML_ERROR_NOT_STANDALONE, |
XML_ERROR_UNEXPECTED_STATE, |
XML_ERROR_ENTITY_DECLARED_IN_PE, |
XML_ERROR_FEATURE_REQUIRES_XML_DTD, |
XML_ERROR_CANT_CHANGE_FEATURE_ONCE_PARSING, |
/* Added in 1.95.7. */ |
XML_ERROR_UNBOUND_PREFIX, |
/* Added in 1.95.8. */ |
XML_ERROR_UNDECLARING_PREFIX, |
XML_ERROR_INCOMPLETE_PE, |
XML_ERROR_XML_DECL, |
XML_ERROR_TEXT_DECL, |
XML_ERROR_PUBLICID, |
XML_ERROR_SUSPENDED, |
XML_ERROR_NOT_SUSPENDED, |
XML_ERROR_ABORTED, |
XML_ERROR_FINISHED, |
XML_ERROR_SUSPEND_PE, |
/* Added in 2.0. */ |
XML_ERROR_RESERVED_PREFIX_XML, |
XML_ERROR_RESERVED_PREFIX_XMLNS, |
XML_ERROR_RESERVED_NAMESPACE_URI |
}; |
enum XML_Content_Type { |
XML_CTYPE_EMPTY = 1, |
XML_CTYPE_ANY, |
XML_CTYPE_MIXED, |
XML_CTYPE_NAME, |
XML_CTYPE_CHOICE, |
XML_CTYPE_SEQ |
}; |
enum XML_Content_Quant { |
XML_CQUANT_NONE, |
XML_CQUANT_OPT, |
XML_CQUANT_REP, |
XML_CQUANT_PLUS |
}; |
/* If type == XML_CTYPE_EMPTY or XML_CTYPE_ANY, then quant will be |
XML_CQUANT_NONE, and the other fields will be zero or NULL. |
If type == XML_CTYPE_MIXED, then quant will be NONE or REP and |
numchildren will contain number of elements that may be mixed in |
and children point to an array of XML_Content cells that will be |
all of XML_CTYPE_NAME type with no quantification. |
If type == XML_CTYPE_NAME, then the name points to the name, and |
the numchildren field will be zero and children will be NULL. The |
quant fields indicates any quantifiers placed on the name. |
CHOICE and SEQ will have name NULL, the number of children in |
numchildren and children will point, recursively, to an array |
of XML_Content cells. |
The EMPTY, ANY, and MIXED types will only occur at top level. |
*/ |
typedef struct XML_cp XML_Content; |
struct XML_cp { |
enum XML_Content_Type type; |
enum XML_Content_Quant quant; |
XML_Char * name; |
unsigned int numchildren; |
XML_Content * children; |
}; |
/* This is called for an element declaration. See above for |
description of the model argument. It's the caller's responsibility |
to free model when finished with it. |
*/ |
typedef void (XMLCALL *XML_ElementDeclHandler) (void *userData, |
const XML_Char *name, |
XML_Content *model); |
XMLPARSEAPI(void) |
XML_SetElementDeclHandler(XML_Parser parser, |
XML_ElementDeclHandler eldecl); |
/* The Attlist declaration handler is called for *each* attribute. So |
a single Attlist declaration with multiple attributes declared will |
generate multiple calls to this handler. The "default" parameter |
may be NULL in the case of the "#IMPLIED" or "#REQUIRED" |
keyword. The "isrequired" parameter will be true and the default |
value will be NULL in the case of "#REQUIRED". If "isrequired" is |
true and default is non-NULL, then this is a "#FIXED" default. |
*/ |
typedef void (XMLCALL *XML_AttlistDeclHandler) ( |
void *userData, |
const XML_Char *elname, |
const XML_Char *attname, |
const XML_Char *att_type, |
const XML_Char *dflt, |
int isrequired); |
XMLPARSEAPI(void) |
XML_SetAttlistDeclHandler(XML_Parser parser, |
XML_AttlistDeclHandler attdecl); |
/* The XML declaration handler is called for *both* XML declarations |
and text declarations. The way to distinguish is that the version |
parameter will be NULL for text declarations. The encoding |
parameter may be NULL for XML declarations. The standalone |
parameter will be -1, 0, or 1 indicating respectively that there |
was no standalone parameter in the declaration, that it was given |
as no, or that it was given as yes. |
*/ |
typedef void (XMLCALL *XML_XmlDeclHandler) (void *userData, |
const XML_Char *version, |
const XML_Char *encoding, |
int standalone); |
XMLPARSEAPI(void) |
XML_SetXmlDeclHandler(XML_Parser parser, |
XML_XmlDeclHandler xmldecl); |
typedef struct { |
void *(*malloc_fcn)(size_t size); |
void *(*realloc_fcn)(void *ptr, size_t size); |
void (*free_fcn)(void *ptr); |
} XML_Memory_Handling_Suite; |
/* Constructs a new parser; encoding is the encoding specified by the |
external protocol or NULL if there is none specified. |
*/ |
XMLPARSEAPI(XML_Parser) |
XML_ParserCreate(const XML_Char *encoding); |
/* Constructs a new parser and namespace processor. Element type |
names and attribute names that belong to a namespace will be |
expanded; unprefixed attribute names are never expanded; unprefixed |
element type names are expanded only if there is a default |
namespace. The expanded name is the concatenation of the namespace |
URI, the namespace separator character, and the local part of the |
name. If the namespace separator is '\0' then the namespace URI |
and the local part will be concatenated without any separator. |
It is a programming error to use the separator '\0' with namespace |
triplets (see XML_SetReturnNSTriplet). |
*/ |
XMLPARSEAPI(XML_Parser) |
XML_ParserCreateNS(const XML_Char *encoding, XML_Char namespaceSeparator); |
/* Constructs a new parser using the memory management suite referred to |
by memsuite. If memsuite is NULL, then use the standard library memory |
suite. If namespaceSeparator is non-NULL it creates a parser with |
namespace processing as described above. The character pointed at |
will serve as the namespace separator. |
All further memory operations used for the created parser will come from |
the given suite. |
*/ |
XMLPARSEAPI(XML_Parser) |
XML_ParserCreate_MM(const XML_Char *encoding, |
const XML_Memory_Handling_Suite *memsuite, |
const XML_Char *namespaceSeparator); |
/* Prepare a parser object to be re-used. This is particularly |
valuable when memory allocation overhead is disproportionatly high, |
such as when a large number of small documnents need to be parsed. |
All handlers are cleared from the parser, except for the |
unknownEncodingHandler. The parser's external state is re-initialized |
except for the values of ns and ns_triplets. |
Added in Expat 1.95.3. |
*/ |
XMLPARSEAPI(XML_Bool) |
XML_ParserReset(XML_Parser parser, const XML_Char *encoding); |
/* atts is array of name/value pairs, terminated by 0; |
names and values are 0 terminated. |
*/ |
typedef void (XMLCALL *XML_StartElementHandler) (void *userData, |
const XML_Char *name, |
const XML_Char **atts); |
typedef void (XMLCALL *XML_EndElementHandler) (void *userData, |
const XML_Char *name); |
/* s is not 0 terminated. */ |
typedef void (XMLCALL *XML_CharacterDataHandler) (void *userData, |
const XML_Char *s, |
int len); |
/* target and data are 0 terminated */ |
typedef void (XMLCALL *XML_ProcessingInstructionHandler) ( |
void *userData, |
const XML_Char *target, |
const XML_Char *data); |
/* data is 0 terminated */ |
typedef void (XMLCALL *XML_CommentHandler) (void *userData, |
const XML_Char *data); |
typedef void (XMLCALL *XML_StartCdataSectionHandler) (void *userData); |
typedef void (XMLCALL *XML_EndCdataSectionHandler) (void *userData); |
/* This is called for any characters in the XML document for which |
there is no applicable handler. This includes both characters that |
are part of markup which is of a kind that is not reported |
(comments, markup declarations), or characters that are part of a |
construct which could be reported but for which no handler has been |
supplied. The characters are passed exactly as they were in the XML |
document except that they will be encoded in UTF-8 or UTF-16. |
Line boundaries are not normalized. Note that a byte order mark |
character is not passed to the default handler. There are no |
guarantees about how characters are divided between calls to the |
default handler: for example, a comment might be split between |
multiple calls. |
*/ |
typedef void (XMLCALL *XML_DefaultHandler) (void *userData, |
const XML_Char *s, |
int len); |
/* This is called for the start of the DOCTYPE declaration, before |
any DTD or internal subset is parsed. |
*/ |
typedef void (XMLCALL *XML_StartDoctypeDeclHandler) ( |
void *userData, |
const XML_Char *doctypeName, |
const XML_Char *sysid, |
const XML_Char *pubid, |
int has_internal_subset); |
/* This is called for the start of the DOCTYPE declaration when the |
closing > is encountered, but after processing any external |
subset. |
*/ |
typedef void (XMLCALL *XML_EndDoctypeDeclHandler)(void *userData); |
/* This is called for entity declarations. The is_parameter_entity |
argument will be non-zero if the entity is a parameter entity, zero |
otherwise. |
For internal entities (<!ENTITY foo "bar">), value will |
be non-NULL and systemId, publicID, and notationName will be NULL. |
The value string is NOT nul-terminated; the length is provided in |
the value_length argument. Since it is legal to have zero-length |
values, do not use this argument to test for internal entities. |
For external entities, value will be NULL and systemId will be |
non-NULL. The publicId argument will be NULL unless a public |
identifier was provided. The notationName argument will have a |
non-NULL value only for unparsed entity declarations. |
Note that is_parameter_entity can't be changed to XML_Bool, since |
that would break binary compatibility. |
*/ |
typedef void (XMLCALL *XML_EntityDeclHandler) ( |
void *userData, |
const XML_Char *entityName, |
int is_parameter_entity, |
const XML_Char *value, |
int value_length, |
const XML_Char *base, |
const XML_Char *systemId, |
const XML_Char *publicId, |
const XML_Char *notationName); |
XMLPARSEAPI(void) |
XML_SetEntityDeclHandler(XML_Parser parser, |
XML_EntityDeclHandler handler); |
/* OBSOLETE -- OBSOLETE -- OBSOLETE |
This handler has been superceded by the EntityDeclHandler above. |
It is provided here for backward compatibility. |
This is called for a declaration of an unparsed (NDATA) entity. |
The base argument is whatever was set by XML_SetBase. The |
entityName, systemId and notationName arguments will never be |
NULL. The other arguments may be. |
*/ |
typedef void (XMLCALL *XML_UnparsedEntityDeclHandler) ( |
void *userData, |
const XML_Char *entityName, |
const XML_Char *base, |
const XML_Char *systemId, |
const XML_Char *publicId, |
const XML_Char *notationName); |
/* This is called for a declaration of notation. The base argument is |
whatever was set by XML_SetBase. The notationName will never be |
NULL. The other arguments can be. |
*/ |
typedef void (XMLCALL *XML_NotationDeclHandler) ( |
void *userData, |
const XML_Char *notationName, |
const XML_Char *base, |
const XML_Char *systemId, |
const XML_Char *publicId); |
/* When namespace processing is enabled, these are called once for |
each namespace declaration. The call to the start and end element |
handlers occur between the calls to the start and end namespace |
declaration handlers. For an xmlns attribute, prefix will be |
NULL. For an xmlns="" attribute, uri will be NULL. |
*/ |
typedef void (XMLCALL *XML_StartNamespaceDeclHandler) ( |
void *userData, |
const XML_Char *prefix, |
const XML_Char *uri); |
typedef void (XMLCALL *XML_EndNamespaceDeclHandler) ( |
void *userData, |
const XML_Char *prefix); |
/* This is called if the document is not standalone, that is, it has an |
external subset or a reference to a parameter entity, but does not |
have standalone="yes". If this handler returns XML_STATUS_ERROR, |
then processing will not continue, and the parser will return a |
XML_ERROR_NOT_STANDALONE error. |
If parameter entity parsing is enabled, then in addition to the |
conditions above this handler will only be called if the referenced |
entity was actually read. |
*/ |
typedef int (XMLCALL *XML_NotStandaloneHandler) (void *userData); |
/* This is called for a reference to an external parsed general |
entity. The referenced entity is not automatically parsed. The |
application can parse it immediately or later using |
XML_ExternalEntityParserCreate. |
The parser argument is the parser parsing the entity containing the |
reference; it can be passed as the parser argument to |
XML_ExternalEntityParserCreate. The systemId argument is the |
system identifier as specified in the entity declaration; it will |
not be NULL. |
The base argument is the system identifier that should be used as |
the base for resolving systemId if systemId was relative; this is |
set by XML_SetBase; it may be NULL. |
The publicId argument is the public identifier as specified in the |
entity declaration, or NULL if none was specified; the whitespace |
in the public identifier will have been normalized as required by |
the XML spec. |
The context argument specifies the parsing context in the format |
expected by the context argument to XML_ExternalEntityParserCreate; |
context is valid only until the handler returns, so if the |
referenced entity is to be parsed later, it must be copied. |
context is NULL only when the entity is a parameter entity. |
The handler should return XML_STATUS_ERROR if processing should not |
continue because of a fatal error in the handling of the external |
entity. In this case the calling parser will return an |
XML_ERROR_EXTERNAL_ENTITY_HANDLING error. |
Note that unlike other handlers the first argument is the parser, |
not userData. |
*/ |
typedef int (XMLCALL *XML_ExternalEntityRefHandler) ( |
XML_Parser parser, |
const XML_Char *context, |
const XML_Char *base, |
const XML_Char *systemId, |
const XML_Char *publicId); |
/* This is called in two situations: |
1) An entity reference is encountered for which no declaration |
has been read *and* this is not an error. |
2) An internal entity reference is read, but not expanded, because |
XML_SetDefaultHandler has been called. |
Note: skipped parameter entities in declarations and skipped general |
entities in attribute values cannot be reported, because |
the event would be out of sync with the reporting of the |
declarations or attribute values |
*/ |
typedef void (XMLCALL *XML_SkippedEntityHandler) ( |
void *userData, |
const XML_Char *entityName, |
int is_parameter_entity); |
/* This structure is filled in by the XML_UnknownEncodingHandler to |
provide information to the parser about encodings that are unknown |
to the parser. |
The map[b] member gives information about byte sequences whose |
first byte is b. |
If map[b] is c where c is >= 0, then b by itself encodes the |
Unicode scalar value c. |
If map[b] is -1, then the byte sequence is malformed. |
If map[b] is -n, where n >= 2, then b is the first byte of an |
n-byte sequence that encodes a single Unicode scalar value. |
The data member will be passed as the first argument to the convert |
function. |
The convert function is used to convert multibyte sequences; s will |
point to a n-byte sequence where map[(unsigned char)*s] == -n. The |
convert function must return the Unicode scalar value represented |
by this byte sequence or -1 if the byte sequence is malformed. |
The convert function may be NULL if the encoding is a single-byte |
encoding, that is if map[b] >= -1 for all bytes b. |
When the parser is finished with the encoding, then if release is |
not NULL, it will call release passing it the data member; once |
release has been called, the convert function will not be called |
again. |
Expat places certain restrictions on the encodings that are supported |
using this mechanism. |
1. Every ASCII character that can appear in a well-formed XML document, |
other than the characters |
$@\^`{}~ |
must be represented by a single byte, and that byte must be the |
same byte that represents that character in ASCII. |
2. No character may require more than 4 bytes to encode. |
3. All characters encoded must have Unicode scalar values <= |
0xFFFF, (i.e., characters that would be encoded by surrogates in |
UTF-16 are not allowed). Note that this restriction doesn't |
apply to the built-in support for UTF-8 and UTF-16. |
4. No Unicode character may be encoded by more than one distinct |
sequence of bytes. |
*/ |
typedef struct { |
int map[256]; |
void *data; |
int (XMLCALL *convert)(void *data, const char *s); |
void (XMLCALL *release)(void *data); |
} XML_Encoding; |
/* This is called for an encoding that is unknown to the parser. |
The encodingHandlerData argument is that which was passed as the |
second argument to XML_SetUnknownEncodingHandler. |
The name argument gives the name of the encoding as specified in |
the encoding declaration. |
If the callback can provide information about the encoding, it must |
fill in the XML_Encoding structure, and return XML_STATUS_OK. |
Otherwise it must return XML_STATUS_ERROR. |
If info does not describe a suitable encoding, then the parser will |
return an XML_UNKNOWN_ENCODING error. |
*/ |
typedef int (XMLCALL *XML_UnknownEncodingHandler) ( |
void *encodingHandlerData, |
const XML_Char *name, |
XML_Encoding *info); |
XMLPARSEAPI(void) |
XML_SetElementHandler(XML_Parser parser, |
XML_StartElementHandler start, |
XML_EndElementHandler end); |
XMLPARSEAPI(void) |
XML_SetStartElementHandler(XML_Parser parser, |
XML_StartElementHandler handler); |
XMLPARSEAPI(void) |
XML_SetEndElementHandler(XML_Parser parser, |
XML_EndElementHandler handler); |
XMLPARSEAPI(void) |
XML_SetCharacterDataHandler(XML_Parser parser, |
XML_CharacterDataHandler handler); |
XMLPARSEAPI(void) |
XML_SetProcessingInstructionHandler(XML_Parser parser, |
XML_ProcessingInstructionHandler handler); |
XMLPARSEAPI(void) |
XML_SetCommentHandler(XML_Parser parser, |
XML_CommentHandler handler); |
XMLPARSEAPI(void) |
XML_SetCdataSectionHandler(XML_Parser parser, |
XML_StartCdataSectionHandler start, |
XML_EndCdataSectionHandler end); |
XMLPARSEAPI(void) |
XML_SetStartCdataSectionHandler(XML_Parser parser, |
XML_StartCdataSectionHandler start); |
XMLPARSEAPI(void) |
XML_SetEndCdataSectionHandler(XML_Parser parser, |
XML_EndCdataSectionHandler end); |
/* This sets the default handler and also inhibits expansion of |
internal entities. These entity references will be passed to the |
default handler, or to the skipped entity handler, if one is set. |
*/ |
XMLPARSEAPI(void) |
XML_SetDefaultHandler(XML_Parser parser, |
XML_DefaultHandler handler); |
/* This sets the default handler but does not inhibit expansion of |
internal entities. The entity reference will not be passed to the |
default handler. |
*/ |
XMLPARSEAPI(void) |
XML_SetDefaultHandlerExpand(XML_Parser parser, |
XML_DefaultHandler handler); |
XMLPARSEAPI(void) |
XML_SetDoctypeDeclHandler(XML_Parser parser, |
XML_StartDoctypeDeclHandler start, |
XML_EndDoctypeDeclHandler end); |
XMLPARSEAPI(void) |
XML_SetStartDoctypeDeclHandler(XML_Parser parser, |
XML_StartDoctypeDeclHandler start); |
XMLPARSEAPI(void) |
XML_SetEndDoctypeDeclHandler(XML_Parser parser, |
XML_EndDoctypeDeclHandler end); |
XMLPARSEAPI(void) |
XML_SetUnparsedEntityDeclHandler(XML_Parser parser, |
XML_UnparsedEntityDeclHandler handler); |
XMLPARSEAPI(void) |
XML_SetNotationDeclHandler(XML_Parser parser, |
XML_NotationDeclHandler handler); |
XMLPARSEAPI(void) |
XML_SetNamespaceDeclHandler(XML_Parser parser, |
XML_StartNamespaceDeclHandler start, |
XML_EndNamespaceDeclHandler end); |
XMLPARSEAPI(void) |
XML_SetStartNamespaceDeclHandler(XML_Parser parser, |
XML_StartNamespaceDeclHandler start); |
XMLPARSEAPI(void) |
XML_SetEndNamespaceDeclHandler(XML_Parser parser, |
XML_EndNamespaceDeclHandler end); |
XMLPARSEAPI(void) |
XML_SetNotStandaloneHandler(XML_Parser parser, |
XML_NotStandaloneHandler handler); |
XMLPARSEAPI(void) |
XML_SetExternalEntityRefHandler(XML_Parser parser, |
XML_ExternalEntityRefHandler handler); |
/* If a non-NULL value for arg is specified here, then it will be |
passed as the first argument to the external entity ref handler |
instead of the parser object. |
*/ |
XMLPARSEAPI(void) |
XML_SetExternalEntityRefHandlerArg(XML_Parser parser, |
void *arg); |
XMLPARSEAPI(void) |
XML_SetSkippedEntityHandler(XML_Parser parser, |
XML_SkippedEntityHandler handler); |
XMLPARSEAPI(void) |
XML_SetUnknownEncodingHandler(XML_Parser parser, |
XML_UnknownEncodingHandler handler, |
void *encodingHandlerData); |
/* This can be called within a handler for a start element, end |
element, processing instruction or character data. It causes the |
corresponding markup to be passed to the default handler. |
*/ |
XMLPARSEAPI(void) |
XML_DefaultCurrent(XML_Parser parser); |
/* If do_nst is non-zero, and namespace processing is in effect, and |
a name has a prefix (i.e. an explicit namespace qualifier) then |
that name is returned as a triplet in a single string separated by |
the separator character specified when the parser was created: URI |
+ sep + local_name + sep + prefix. |
If do_nst is zero, then namespace information is returned in the |
default manner (URI + sep + local_name) whether or not the name |
has a prefix. |
Note: Calling XML_SetReturnNSTriplet after XML_Parse or |
XML_ParseBuffer has no effect. |
*/ |
XMLPARSEAPI(void) |
XML_SetReturnNSTriplet(XML_Parser parser, int do_nst); |
/* This value is passed as the userData argument to callbacks. */ |
XMLPARSEAPI(void) |
XML_SetUserData(XML_Parser parser, void *userData); |
/* Returns the last value set by XML_SetUserData or NULL. */ |
#define XML_GetUserData(parser) (*(void **)(parser)) |
/* This is equivalent to supplying an encoding argument to |
XML_ParserCreate. On success XML_SetEncoding returns non-zero, |
zero otherwise. |
Note: Calling XML_SetEncoding after XML_Parse or XML_ParseBuffer |
has no effect and returns XML_STATUS_ERROR. |
*/ |
XMLPARSEAPI(enum XML_Status) |
XML_SetEncoding(XML_Parser parser, const XML_Char *encoding); |
/* If this function is called, then the parser will be passed as the |
first argument to callbacks instead of userData. The userData will |
still be accessible using XML_GetUserData. |
*/ |
XMLPARSEAPI(void) |
XML_UseParserAsHandlerArg(XML_Parser parser); |
/* If useDTD == XML_TRUE is passed to this function, then the parser |
will assume that there is an external subset, even if none is |
specified in the document. In such a case the parser will call the |
externalEntityRefHandler with a value of NULL for the systemId |
argument (the publicId and context arguments will be NULL as well). |
Note: For the purpose of checking WFC: Entity Declared, passing |
useDTD == XML_TRUE will make the parser behave as if the document |
had a DTD with an external subset. |
Note: If this function is called, then this must be done before |
the first call to XML_Parse or XML_ParseBuffer, since it will |
have no effect after that. Returns |
XML_ERROR_CANT_CHANGE_FEATURE_ONCE_PARSING. |
Note: If the document does not have a DOCTYPE declaration at all, |
then startDoctypeDeclHandler and endDoctypeDeclHandler will not |
be called, despite an external subset being parsed. |
Note: If XML_DTD is not defined when Expat is compiled, returns |
XML_ERROR_FEATURE_REQUIRES_XML_DTD. |
*/ |
XMLPARSEAPI(enum XML_Error) |
XML_UseForeignDTD(XML_Parser parser, XML_Bool useDTD); |
/* Sets the base to be used for resolving relative URIs in system |
identifiers in declarations. Resolving relative identifiers is |
left to the application: this value will be passed through as the |
base argument to the XML_ExternalEntityRefHandler, |
XML_NotationDeclHandler and XML_UnparsedEntityDeclHandler. The base |
argument will be copied. Returns XML_STATUS_ERROR if out of memory, |
XML_STATUS_OK otherwise. |
*/ |
XMLPARSEAPI(enum XML_Status) |
XML_SetBase(XML_Parser parser, const XML_Char *base); |
XMLPARSEAPI(const XML_Char *) |
XML_GetBase(XML_Parser parser); |
/* Returns the number of the attribute/value pairs passed in last call |
to the XML_StartElementHandler that were specified in the start-tag |
rather than defaulted. Each attribute/value pair counts as 2; thus |
this correspondds to an index into the atts array passed to the |
XML_StartElementHandler. |
*/ |
XMLPARSEAPI(int) |
XML_GetSpecifiedAttributeCount(XML_Parser parser); |
/* Returns the index of the ID attribute passed in the last call to |
XML_StartElementHandler, or -1 if there is no ID attribute. Each |
attribute/value pair counts as 2; thus this correspondds to an |
index into the atts array passed to the XML_StartElementHandler. |
*/ |
XMLPARSEAPI(int) |
XML_GetIdAttributeIndex(XML_Parser parser); |
#ifdef XML_ATTR_INFO |
/* Source file byte offsets for the start and end of attribute names and values. |
The value indices are exclusive of surrounding quotes; thus in a UTF-8 source |
file an attribute value of "blah" will yield: |
info->valueEnd - info->valueStart = 4 bytes. |
*/ |
typedef struct { |
XML_Index nameStart; /* Offset to beginning of the attribute name. */ |
XML_Index nameEnd; /* Offset after the attribute name's last byte. */ |
XML_Index valueStart; /* Offset to beginning of the attribute value. */ |
XML_Index valueEnd; /* Offset after the attribute value's last byte. */ |
} XML_AttrInfo; |
/* Returns an array of XML_AttrInfo structures for the attribute/value pairs |
passed in last call to the XML_StartElementHandler that were specified |
in the start-tag rather than defaulted. Each attribute/value pair counts |
as 1; thus the number of entries in the array is |
XML_GetSpecifiedAttributeCount(parser) / 2. |
*/ |
XMLPARSEAPI(const XML_AttrInfo *) |
XML_GetAttributeInfo(XML_Parser parser); |
#endif |
/* Parses some input. Returns XML_STATUS_ERROR if a fatal error is |
detected. The last call to XML_Parse must have isFinal true; len |
may be zero for this call (or any other). |
Though the return values for these functions has always been |
described as a Boolean value, the implementation, at least for the |
1.95.x series, has always returned exactly one of the XML_Status |
values. |
*/ |
XMLPARSEAPI(enum XML_Status) |
XML_Parse(XML_Parser parser, const char *s, int len, int isFinal); |
XMLPARSEAPI(void *) |
XML_GetBuffer(XML_Parser parser, int len); |
XMLPARSEAPI(enum XML_Status) |
XML_ParseBuffer(XML_Parser parser, int len, int isFinal); |
/* Stops parsing, causing XML_Parse() or XML_ParseBuffer() to return. |
Must be called from within a call-back handler, except when aborting |
(resumable = 0) an already suspended parser. Some call-backs may |
still follow because they would otherwise get lost. Examples: |
- endElementHandler() for empty elements when stopped in |
startElementHandler(), |
- endNameSpaceDeclHandler() when stopped in endElementHandler(), |
and possibly others. |
Can be called from most handlers, including DTD related call-backs, |
except when parsing an external parameter entity and resumable != 0. |
Returns XML_STATUS_OK when successful, XML_STATUS_ERROR otherwise. |
Possible error codes: |
- XML_ERROR_SUSPENDED: when suspending an already suspended parser. |
- XML_ERROR_FINISHED: when the parser has already finished. |
- XML_ERROR_SUSPEND_PE: when suspending while parsing an external PE. |
When resumable != 0 (true) then parsing is suspended, that is, |
XML_Parse() and XML_ParseBuffer() return XML_STATUS_SUSPENDED. |
Otherwise, parsing is aborted, that is, XML_Parse() and XML_ParseBuffer() |
return XML_STATUS_ERROR with error code XML_ERROR_ABORTED. |
*Note*: |
This will be applied to the current parser instance only, that is, if |
there is a parent parser then it will continue parsing when the |
externalEntityRefHandler() returns. It is up to the implementation of |
the externalEntityRefHandler() to call XML_StopParser() on the parent |
parser (recursively), if one wants to stop parsing altogether. |
When suspended, parsing can be resumed by calling XML_ResumeParser(). |
*/ |
XMLPARSEAPI(enum XML_Status) |
XML_StopParser(XML_Parser parser, XML_Bool resumable); |
/* Resumes parsing after it has been suspended with XML_StopParser(). |
Must not be called from within a handler call-back. Returns same |
status codes as XML_Parse() or XML_ParseBuffer(). |
Additional error code XML_ERROR_NOT_SUSPENDED possible. |
*Note*: |
This must be called on the most deeply nested child parser instance |
first, and on its parent parser only after the child parser has finished, |
to be applied recursively until the document entity's parser is restarted. |
That is, the parent parser will not resume by itself and it is up to the |
application to call XML_ResumeParser() on it at the appropriate moment. |
*/ |
XMLPARSEAPI(enum XML_Status) |
XML_ResumeParser(XML_Parser parser); |
enum XML_Parsing { |
XML_INITIALIZED, |
XML_PARSING, |
XML_FINISHED, |
XML_SUSPENDED |
}; |
typedef struct { |
enum XML_Parsing parsing; |
XML_Bool finalBuffer; |
} XML_ParsingStatus; |
/* Returns status of parser with respect to being initialized, parsing, |
finished, or suspended and processing the final buffer. |
XXX XML_Parse() and XML_ParseBuffer() should return XML_ParsingStatus, |
XXX with XML_FINISHED_OK or XML_FINISHED_ERROR replacing XML_FINISHED |
*/ |
XMLPARSEAPI(void) |
XML_GetParsingStatus(XML_Parser parser, XML_ParsingStatus *status); |
/* Creates an XML_Parser object that can parse an external general |
entity; context is a '\0'-terminated string specifying the parse |
context; encoding is a '\0'-terminated string giving the name of |
the externally specified encoding, or NULL if there is no |
externally specified encoding. The context string consists of a |
sequence of tokens separated by formfeeds (\f); a token consisting |
of a name specifies that the general entity of the name is open; a |
token of the form prefix=uri specifies the namespace for a |
particular prefix; a token of the form =uri specifies the default |
namespace. This can be called at any point after the first call to |
an ExternalEntityRefHandler so longer as the parser has not yet |
been freed. The new parser is completely independent and may |
safely be used in a separate thread. The handlers and userData are |
initialized from the parser argument. Returns NULL if out of memory. |
Otherwise returns a new XML_Parser object. |
*/ |
XMLPARSEAPI(XML_Parser) |
XML_ExternalEntityParserCreate(XML_Parser parser, |
const XML_Char *context, |
const XML_Char *encoding); |
enum XML_ParamEntityParsing { |
XML_PARAM_ENTITY_PARSING_NEVER, |
XML_PARAM_ENTITY_PARSING_UNLESS_STANDALONE, |
XML_PARAM_ENTITY_PARSING_ALWAYS |
}; |
/* Controls parsing of parameter entities (including the external DTD |
subset). If parsing of parameter entities is enabled, then |
references to external parameter entities (including the external |
DTD subset) will be passed to the handler set with |
XML_SetExternalEntityRefHandler. The context passed will be 0. |
Unlike external general entities, external parameter entities can |
only be parsed synchronously. If the external parameter entity is |
to be parsed, it must be parsed during the call to the external |
entity ref handler: the complete sequence of |
XML_ExternalEntityParserCreate, XML_Parse/XML_ParseBuffer and |
XML_ParserFree calls must be made during this call. After |
XML_ExternalEntityParserCreate has been called to create the parser |
for the external parameter entity (context must be 0 for this |
call), it is illegal to make any calls on the old parser until |
XML_ParserFree has been called on the newly created parser. |
If the library has been compiled without support for parameter |
entity parsing (ie without XML_DTD being defined), then |
XML_SetParamEntityParsing will return 0 if parsing of parameter |
entities is requested; otherwise it will return non-zero. |
Note: If XML_SetParamEntityParsing is called after XML_Parse or |
XML_ParseBuffer, then it has no effect and will always return 0. |
*/ |
XMLPARSEAPI(int) |
XML_SetParamEntityParsing(XML_Parser parser, |
enum XML_ParamEntityParsing parsing); |
/* Sets the hash salt to use for internal hash calculations. |
Helps in preventing DoS attacks based on predicting hash |
function behavior. This must be called before parsing is started. |
Returns 1 if successful, 0 when called after parsing has started. |
*/ |
XMLPARSEAPI(int) |
XML_SetHashSalt(XML_Parser parser, |
unsigned long hash_salt); |
/* If XML_Parse or XML_ParseBuffer have returned XML_STATUS_ERROR, then |
XML_GetErrorCode returns information about the error. |
*/ |
XMLPARSEAPI(enum XML_Error) |
XML_GetErrorCode(XML_Parser parser); |
/* These functions return information about the current parse |
location. They may be called from any callback called to report |
some parse event; in this case the location is the location of the |
first of the sequence of characters that generated the event. When |
called from callbacks generated by declarations in the document |
prologue, the location identified isn't as neatly defined, but will |
be within the relevant markup. When called outside of the callback |
functions, the position indicated will be just past the last parse |
event (regardless of whether there was an associated callback). |
They may also be called after returning from a call to XML_Parse |
or XML_ParseBuffer. If the return value is XML_STATUS_ERROR then |
the location is the location of the character at which the error |
was detected; otherwise the location is the location of the last |
parse event, as described above. |
*/ |
XMLPARSEAPI(XML_Size) XML_GetCurrentLineNumber(XML_Parser parser); |
XMLPARSEAPI(XML_Size) XML_GetCurrentColumnNumber(XML_Parser parser); |
XMLPARSEAPI(XML_Index) XML_GetCurrentByteIndex(XML_Parser parser); |
/* Return the number of bytes in the current event. |
Returns 0 if the event is in an internal entity. |
*/ |
XMLPARSEAPI(int) |
XML_GetCurrentByteCount(XML_Parser parser); |
/* If XML_CONTEXT_BYTES is defined, returns the input buffer, sets |
the integer pointed to by offset to the offset within this buffer |
of the current parse position, and sets the integer pointed to by size |
to the size of this buffer (the number of input bytes). Otherwise |
returns a NULL pointer. Also returns a NULL pointer if a parse isn't |
active. |
NOTE: The character pointer returned should not be used outside |
the handler that makes the call. |
*/ |
XMLPARSEAPI(const char *) |
XML_GetInputContext(XML_Parser parser, |
int *offset, |
int *size); |
/* For backwards compatibility with previous versions. */ |
#define XML_GetErrorLineNumber XML_GetCurrentLineNumber |
#define XML_GetErrorColumnNumber XML_GetCurrentColumnNumber |
#define XML_GetErrorByteIndex XML_GetCurrentByteIndex |
/* Frees the content model passed to the element declaration handler */ |
XMLPARSEAPI(void) |
XML_FreeContentModel(XML_Parser parser, XML_Content *model); |
/* Exposing the memory handling functions used in Expat */ |
XMLPARSEAPI(void *) |
XML_MemMalloc(XML_Parser parser, size_t size); |
XMLPARSEAPI(void *) |
XML_MemRealloc(XML_Parser parser, void *ptr, size_t size); |
XMLPARSEAPI(void) |
XML_MemFree(XML_Parser parser, void *ptr); |
/* Frees memory used by the parser. */ |
XMLPARSEAPI(void) |
XML_ParserFree(XML_Parser parser); |
/* Returns a string describing the error. */ |
XMLPARSEAPI(const XML_LChar *) |
XML_ErrorString(enum XML_Error code); |
/* Return a string containing the version number of this expat */ |
XMLPARSEAPI(const XML_LChar *) |
XML_ExpatVersion(void); |
typedef struct { |
int major; |
int minor; |
int micro; |
} XML_Expat_Version; |
/* Return an XML_Expat_Version structure containing numeric version |
number information for this version of expat. |
*/ |
XMLPARSEAPI(XML_Expat_Version) |
XML_ExpatVersionInfo(void); |
/* Added in Expat 1.95.5. */ |
enum XML_FeatureEnum { |
XML_FEATURE_END = 0, |
XML_FEATURE_UNICODE, |
XML_FEATURE_UNICODE_WCHAR_T, |
XML_FEATURE_DTD, |
XML_FEATURE_CONTEXT_BYTES, |
XML_FEATURE_MIN_SIZE, |
XML_FEATURE_SIZEOF_XML_CHAR, |
XML_FEATURE_SIZEOF_XML_LCHAR, |
XML_FEATURE_NS, |
XML_FEATURE_LARGE_SIZE, |
XML_FEATURE_ATTR_INFO |
/* Additional features must be added to the end of this enum. */ |
}; |
typedef struct { |
enum XML_FeatureEnum feature; |
const XML_LChar *name; |
long int value; |
} XML_Feature; |
XMLPARSEAPI(const XML_Feature *) |
XML_GetFeatureList(void); |
/* Expat follows the GNU/Linux convention of odd number minor version for |
beta/development releases and even number minor version for stable |
releases. Micro is bumped with each release, and set to 0 with each |
change to major or minor version. |
*/ |
#define XML_MAJOR_VERSION 2 |
#define XML_MINOR_VERSION 1 |
#define XML_MICRO_VERSION 0 |
#ifdef __cplusplus |
} |
#endif |
#endif /* not Expat_INCLUDED */ |
/contrib/sdk/sources/expat/lib/expat_external.h |
---|
0,0 → 1,115 |
/* Copyright (c) 1998, 1999, 2000 Thai Open Source Software Center Ltd |
See the file COPYING for copying permission. |
*/ |
#ifndef Expat_External_INCLUDED |
#define Expat_External_INCLUDED 1 |
/* External API definitions */ |
#if defined(_MSC_EXTENSIONS) && !defined(__BEOS__) && !defined(__CYGWIN__) |
#define XML_USE_MSC_EXTENSIONS 1 |
#endif |
/* Expat tries very hard to make the API boundary very specifically |
defined. There are two macros defined to control this boundary; |
each of these can be defined before including this header to |
achieve some different behavior, but doing so it not recommended or |
tested frequently. |
XMLCALL - The calling convention to use for all calls across the |
"library boundary." This will default to cdecl, and |
try really hard to tell the compiler that's what we |
want. |
XMLIMPORT - Whatever magic is needed to note that a function is |
to be imported from a dynamically loaded library |
(.dll, .so, or .sl, depending on your platform). |
The XMLCALL macro was added in Expat 1.95.7. The only one which is |
expected to be directly useful in client code is XMLCALL. |
Note that on at least some Unix versions, the Expat library must be |
compiled with the cdecl calling convention as the default since |
system headers may assume the cdecl convention. |
*/ |
#ifndef XMLCALL |
#if defined(_MSC_VER) |
#define XMLCALL __cdecl |
#elif defined(__GNUC__) && defined(__i386) && !defined(__INTEL_COMPILER) |
#define XMLCALL __attribute__((cdecl)) |
#else |
/* For any platform which uses this definition and supports more than |
one calling convention, we need to extend this definition to |
declare the convention used on that platform, if it's possible to |
do so. |
If this is the case for your platform, please file a bug report |
with information on how to identify your platform via the C |
pre-processor and how to specify the same calling convention as the |
platform's malloc() implementation. |
*/ |
#define XMLCALL |
#endif |
#endif /* not defined XMLCALL */ |
#if !defined(XML_STATIC) && !defined(XMLIMPORT) |
#ifndef XML_BUILDING_EXPAT |
/* using Expat from an application */ |
#ifdef XML_USE_MSC_EXTENSIONS |
#define XMLIMPORT __declspec(dllimport) |
#endif |
#endif |
#endif /* not defined XML_STATIC */ |
/* If we didn't define it above, define it away: */ |
#ifndef XMLIMPORT |
#define XMLIMPORT |
#endif |
#define XMLPARSEAPI(type) XMLIMPORT type XMLCALL |
#ifdef __cplusplus |
extern "C" { |
#endif |
#ifdef XML_UNICODE_WCHAR_T |
#define XML_UNICODE |
#endif |
#ifdef XML_UNICODE /* Information is UTF-16 encoded. */ |
#ifdef XML_UNICODE_WCHAR_T |
typedef wchar_t XML_Char; |
typedef wchar_t XML_LChar; |
#else |
typedef unsigned short XML_Char; |
typedef char XML_LChar; |
#endif /* XML_UNICODE_WCHAR_T */ |
#else /* Information is UTF-8 encoded. */ |
typedef char XML_Char; |
typedef char XML_LChar; |
#endif /* XML_UNICODE */ |
#ifdef XML_LARGE_SIZE /* Use large integers for file/stream positions. */ |
#if defined(XML_USE_MSC_EXTENSIONS) && _MSC_VER < 1400 |
typedef __int64 XML_Index; |
typedef unsigned __int64 XML_Size; |
#else |
typedef long long XML_Index; |
typedef unsigned long long XML_Size; |
#endif |
#else |
typedef long XML_Index; |
typedef unsigned long XML_Size; |
#endif /* XML_LARGE_SIZE */ |
#ifdef __cplusplus |
} |
#endif |
#endif /* not Expat_External_INCLUDED */ |
/contrib/sdk/sources/expat/lib/iasciitab.h |
---|
0,0 → 1,37 |
/* Copyright (c) 1998, 1999 Thai Open Source Software Center Ltd |
See the file COPYING for copying permission. |
*/ |
/* Like asciitab.h, except that 0xD has code BT_S rather than BT_CR */ |
/* 0x00 */ BT_NONXML, BT_NONXML, BT_NONXML, BT_NONXML, |
/* 0x04 */ BT_NONXML, BT_NONXML, BT_NONXML, BT_NONXML, |
/* 0x08 */ BT_NONXML, BT_S, BT_LF, BT_NONXML, |
/* 0x0C */ BT_NONXML, BT_S, BT_NONXML, BT_NONXML, |
/* 0x10 */ BT_NONXML, BT_NONXML, BT_NONXML, BT_NONXML, |
/* 0x14 */ BT_NONXML, BT_NONXML, BT_NONXML, BT_NONXML, |
/* 0x18 */ BT_NONXML, BT_NONXML, BT_NONXML, BT_NONXML, |
/* 0x1C */ BT_NONXML, BT_NONXML, BT_NONXML, BT_NONXML, |
/* 0x20 */ BT_S, BT_EXCL, BT_QUOT, BT_NUM, |
/* 0x24 */ BT_OTHER, BT_PERCNT, BT_AMP, BT_APOS, |
/* 0x28 */ BT_LPAR, BT_RPAR, BT_AST, BT_PLUS, |
/* 0x2C */ BT_COMMA, BT_MINUS, BT_NAME, BT_SOL, |
/* 0x30 */ BT_DIGIT, BT_DIGIT, BT_DIGIT, BT_DIGIT, |
/* 0x34 */ BT_DIGIT, BT_DIGIT, BT_DIGIT, BT_DIGIT, |
/* 0x38 */ BT_DIGIT, BT_DIGIT, BT_COLON, BT_SEMI, |
/* 0x3C */ BT_LT, BT_EQUALS, BT_GT, BT_QUEST, |
/* 0x40 */ BT_OTHER, BT_HEX, BT_HEX, BT_HEX, |
/* 0x44 */ BT_HEX, BT_HEX, BT_HEX, BT_NMSTRT, |
/* 0x48 */ BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, |
/* 0x4C */ BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, |
/* 0x50 */ BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, |
/* 0x54 */ BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, |
/* 0x58 */ BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, BT_LSQB, |
/* 0x5C */ BT_OTHER, BT_RSQB, BT_OTHER, BT_NMSTRT, |
/* 0x60 */ BT_OTHER, BT_HEX, BT_HEX, BT_HEX, |
/* 0x64 */ BT_HEX, BT_HEX, BT_HEX, BT_NMSTRT, |
/* 0x68 */ BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, |
/* 0x6C */ BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, |
/* 0x70 */ BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, |
/* 0x74 */ BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, |
/* 0x78 */ BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, BT_OTHER, |
/* 0x7C */ BT_VERBAR, BT_OTHER, BT_OTHER, BT_OTHER, |
/contrib/sdk/sources/expat/lib/internal.h |
---|
0,0 → 1,73 |
/* internal.h |
Internal definitions used by Expat. This is not needed to compile |
client code. |
The following calling convention macros are defined for frequently |
called functions: |
FASTCALL - Used for those internal functions that have a simple |
body and a low number of arguments and local variables. |
PTRCALL - Used for functions called though function pointers. |
PTRFASTCALL - Like PTRCALL, but for low number of arguments. |
inline - Used for selected internal functions for which inlining |
may improve performance on some platforms. |
Note: Use of these macros is based on judgement, not hard rules, |
and therefore subject to change. |
*/ |
#if defined(__GNUC__) && defined(__i386__) && !defined(__MINGW32__) |
/* We'll use this version by default only where we know it helps. |
regparm() generates warnings on Solaris boxes. See SF bug #692878. |
Instability reported with egcs on a RedHat Linux 7.3. |
Let's comment out: |
#define FASTCALL __attribute__((stdcall, regparm(3))) |
and let's try this: |
*/ |
#define FASTCALL __attribute__((regparm(3))) |
#define PTRFASTCALL __attribute__((regparm(3))) |
#endif |
/* Using __fastcall seems to have an unexpected negative effect under |
MS VC++, especially for function pointers, so we won't use it for |
now on that platform. It may be reconsidered for a future release |
if it can be made more effective. |
Likely reason: __fastcall on Windows is like stdcall, therefore |
the compiler cannot perform stack optimizations for call clusters. |
*/ |
/* Make sure all of these are defined if they aren't already. */ |
#ifndef FASTCALL |
#define FASTCALL |
#endif |
#ifndef PTRCALL |
#define PTRCALL |
#endif |
#ifndef PTRFASTCALL |
#define PTRFASTCALL |
#endif |
#ifndef XML_MIN_SIZE |
#if !defined(__cplusplus) && !defined(inline) |
#ifdef __GNUC__ |
#define inline __inline |
#endif /* __GNUC__ */ |
#endif |
#endif /* XML_MIN_SIZE */ |
#ifdef __cplusplus |
#define inline inline |
#else |
#ifndef inline |
#define inline |
#endif |
#endif |
/contrib/sdk/sources/expat/lib/latin1tab.h |
---|
0,0 → 1,36 |
/* Copyright (c) 1998, 1999 Thai Open Source Software Center Ltd |
See the file COPYING for copying permission. |
*/ |
/* 0x80 */ BT_OTHER, BT_OTHER, BT_OTHER, BT_OTHER, |
/* 0x84 */ BT_OTHER, BT_OTHER, BT_OTHER, BT_OTHER, |
/* 0x88 */ BT_OTHER, BT_OTHER, BT_OTHER, BT_OTHER, |
/* 0x8C */ BT_OTHER, BT_OTHER, BT_OTHER, BT_OTHER, |
/* 0x90 */ BT_OTHER, BT_OTHER, BT_OTHER, BT_OTHER, |
/* 0x94 */ BT_OTHER, BT_OTHER, BT_OTHER, BT_OTHER, |
/* 0x98 */ BT_OTHER, BT_OTHER, BT_OTHER, BT_OTHER, |
/* 0x9C */ BT_OTHER, BT_OTHER, BT_OTHER, BT_OTHER, |
/* 0xA0 */ BT_OTHER, BT_OTHER, BT_OTHER, BT_OTHER, |
/* 0xA4 */ BT_OTHER, BT_OTHER, BT_OTHER, BT_OTHER, |
/* 0xA8 */ BT_OTHER, BT_OTHER, BT_NMSTRT, BT_OTHER, |
/* 0xAC */ BT_OTHER, BT_OTHER, BT_OTHER, BT_OTHER, |
/* 0xB0 */ BT_OTHER, BT_OTHER, BT_OTHER, BT_OTHER, |
/* 0xB4 */ BT_OTHER, BT_NMSTRT, BT_OTHER, BT_NAME, |
/* 0xB8 */ BT_OTHER, BT_OTHER, BT_NMSTRT, BT_OTHER, |
/* 0xBC */ BT_OTHER, BT_OTHER, BT_OTHER, BT_OTHER, |
/* 0xC0 */ BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, |
/* 0xC4 */ BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, |
/* 0xC8 */ BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, |
/* 0xCC */ BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, |
/* 0xD0 */ BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, |
/* 0xD4 */ BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, BT_OTHER, |
/* 0xD8 */ BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, |
/* 0xDC */ BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, |
/* 0xE0 */ BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, |
/* 0xE4 */ BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, |
/* 0xE8 */ BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, |
/* 0xEC */ BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, |
/* 0xF0 */ BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, |
/* 0xF4 */ BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, BT_OTHER, |
/* 0xF8 */ BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, |
/* 0xFC */ BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, BT_NMSTRT, |
/contrib/sdk/sources/expat/lib/libexpat.def |
---|
0,0 → 1,73 |
; DEF file for MS VC++ |
LIBRARY |
EXPORTS |
XML_DefaultCurrent @1 |
XML_ErrorString @2 |
XML_ExpatVersion @3 |
XML_ExpatVersionInfo @4 |
XML_ExternalEntityParserCreate @5 |
XML_GetBase @6 |
XML_GetBuffer @7 |
XML_GetCurrentByteCount @8 |
XML_GetCurrentByteIndex @9 |
XML_GetCurrentColumnNumber @10 |
XML_GetCurrentLineNumber @11 |
XML_GetErrorCode @12 |
XML_GetIdAttributeIndex @13 |
XML_GetInputContext @14 |
XML_GetSpecifiedAttributeCount @15 |
XML_Parse @16 |
XML_ParseBuffer @17 |
XML_ParserCreate @18 |
XML_ParserCreateNS @19 |
XML_ParserCreate_MM @20 |
XML_ParserFree @21 |
XML_SetAttlistDeclHandler @22 |
XML_SetBase @23 |
XML_SetCdataSectionHandler @24 |
XML_SetCharacterDataHandler @25 |
XML_SetCommentHandler @26 |
XML_SetDefaultHandler @27 |
XML_SetDefaultHandlerExpand @28 |
XML_SetDoctypeDeclHandler @29 |
XML_SetElementDeclHandler @30 |
XML_SetElementHandler @31 |
XML_SetEncoding @32 |
XML_SetEndCdataSectionHandler @33 |
XML_SetEndDoctypeDeclHandler @34 |
XML_SetEndElementHandler @35 |
XML_SetEndNamespaceDeclHandler @36 |
XML_SetEntityDeclHandler @37 |
XML_SetExternalEntityRefHandler @38 |
XML_SetExternalEntityRefHandlerArg @39 |
XML_SetNamespaceDeclHandler @40 |
XML_SetNotStandaloneHandler @41 |
XML_SetNotationDeclHandler @42 |
XML_SetParamEntityParsing @43 |
XML_SetProcessingInstructionHandler @44 |
XML_SetReturnNSTriplet @45 |
XML_SetStartCdataSectionHandler @46 |
XML_SetStartDoctypeDeclHandler @47 |
XML_SetStartElementHandler @48 |
XML_SetStartNamespaceDeclHandler @49 |
XML_SetUnknownEncodingHandler @50 |
XML_SetUnparsedEntityDeclHandler @51 |
XML_SetUserData @52 |
XML_SetXmlDeclHandler @53 |
XML_UseParserAsHandlerArg @54 |
; added with version 1.95.3 |
XML_ParserReset @55 |
XML_SetSkippedEntityHandler @56 |
; added with version 1.95.5 |
XML_GetFeatureList @57 |
XML_UseForeignDTD @58 |
; added with version 1.95.6 |
XML_FreeContentModel @59 |
XML_MemMalloc @60 |
XML_MemRealloc @61 |
XML_MemFree @62 |
; added with version 1.95.8 |
XML_StopParser @63 |
XML_ResumeParser @64 |
XML_GetParsingStatus @65 |
/contrib/sdk/sources/expat/lib/libexpatw.def |
---|
0,0 → 1,73 |
; DEF file for MS VC++ |
LIBRARY |
EXPORTS |
XML_DefaultCurrent @1 |
XML_ErrorString @2 |
XML_ExpatVersion @3 |
XML_ExpatVersionInfo @4 |
XML_ExternalEntityParserCreate @5 |
XML_GetBase @6 |
XML_GetBuffer @7 |
XML_GetCurrentByteCount @8 |
XML_GetCurrentByteIndex @9 |
XML_GetCurrentColumnNumber @10 |
XML_GetCurrentLineNumber @11 |
XML_GetErrorCode @12 |
XML_GetIdAttributeIndex @13 |
XML_GetInputContext @14 |
XML_GetSpecifiedAttributeCount @15 |
XML_Parse @16 |
XML_ParseBuffer @17 |
XML_ParserCreate @18 |
XML_ParserCreateNS @19 |
XML_ParserCreate_MM @20 |
XML_ParserFree @21 |
XML_SetAttlistDeclHandler @22 |
XML_SetBase @23 |
XML_SetCdataSectionHandler @24 |
XML_SetCharacterDataHandler @25 |
XML_SetCommentHandler @26 |
XML_SetDefaultHandler @27 |
XML_SetDefaultHandlerExpand @28 |
XML_SetDoctypeDeclHandler @29 |
XML_SetElementDeclHandler @30 |
XML_SetElementHandler @31 |
XML_SetEncoding @32 |
XML_SetEndCdataSectionHandler @33 |
XML_SetEndDoctypeDeclHandler @34 |
XML_SetEndElementHandler @35 |
XML_SetEndNamespaceDeclHandler @36 |
XML_SetEntityDeclHandler @37 |
XML_SetExternalEntityRefHandler @38 |
XML_SetExternalEntityRefHandlerArg @39 |
XML_SetNamespaceDeclHandler @40 |
XML_SetNotStandaloneHandler @41 |
XML_SetNotationDeclHandler @42 |
XML_SetParamEntityParsing @43 |
XML_SetProcessingInstructionHandler @44 |
XML_SetReturnNSTriplet @45 |
XML_SetStartCdataSectionHandler @46 |
XML_SetStartDoctypeDeclHandler @47 |
XML_SetStartElementHandler @48 |
XML_SetStartNamespaceDeclHandler @49 |
XML_SetUnknownEncodingHandler @50 |
XML_SetUnparsedEntityDeclHandler @51 |
XML_SetUserData @52 |
XML_SetXmlDeclHandler @53 |
XML_UseParserAsHandlerArg @54 |
; added with version 1.95.3 |
XML_ParserReset @55 |
XML_SetSkippedEntityHandler @56 |
; added with version 1.95.5 |
XML_GetFeatureList @57 |
XML_UseForeignDTD @58 |
; added with version 1.95.6 |
XML_FreeContentModel @59 |
XML_MemMalloc @60 |
XML_MemRealloc @61 |
XML_MemFree @62 |
; added with version 1.95.8 |
XML_StopParser @63 |
XML_ResumeParser @64 |
XML_GetParsingStatus @65 |
/contrib/sdk/sources/expat/lib/macconfig.h |
---|
0,0 → 1,53 |
/*================================================================ |
** Copyright 2000, Clark Cooper |
** All rights reserved. |
** |
** This is free software. You are permitted to copy, distribute, or modify |
** it under the terms of the MIT/X license (contained in the COPYING file |
** with this distribution.) |
** |
*/ |
#ifndef MACCONFIG_H |
#define MACCONFIG_H |
/* 1234 = LIL_ENDIAN, 4321 = BIGENDIAN */ |
#define BYTEORDER 4321 |
/* Define to 1 if you have the `bcopy' function. */ |
#undef HAVE_BCOPY |
/* Define to 1 if you have the `memmove' function. */ |
#define HAVE_MEMMOVE |
/* Define to 1 if you have a working `mmap' system call. */ |
#undef HAVE_MMAP |
/* Define to 1 if you have the <unistd.h> header file. */ |
#undef HAVE_UNISTD_H |
/* whether byteorder is bigendian */ |
#define WORDS_BIGENDIAN |
/* Define to specify how much context to retain around the current parse |
point. */ |
#undef XML_CONTEXT_BYTES |
/* Define to make parameter entity parsing functionality available. */ |
#define XML_DTD |
/* Define to make XML Namespaces functionality available. */ |
#define XML_NS |
/* Define to empty if `const' does not conform to ANSI C. */ |
#undef const |
/* Define to `long' if <sys/types.h> does not define. */ |
#define off_t long |
/* Define to `unsigned' if <sys/types.h> does not define. */ |
#undef size_t |
#endif /* ifndef MACCONFIG_H */ |
/contrib/sdk/sources/expat/lib/nametab.h |
---|
0,0 → 1,150 |
static const unsigned namingBitmap[] = { |
0x00000000, 0x00000000, 0x00000000, 0x00000000, |
0x00000000, 0x00000000, 0x00000000, 0x00000000, |
0xFFFFFFFF, 0xFFFFFFFF, 0xFFFFFFFF, 0xFFFFFFFF, |
0xFFFFFFFF, 0xFFFFFFFF, 0xFFFFFFFF, 0xFFFFFFFF, |
0x00000000, 0x04000000, 0x87FFFFFE, 0x07FFFFFE, |
0x00000000, 0x00000000, 0xFF7FFFFF, 0xFF7FFFFF, |
0xFFFFFFFF, 0x7FF3FFFF, 0xFFFFFDFE, 0x7FFFFFFF, |
0xFFFFFFFF, 0xFFFFFFFF, 0xFFFFE00F, 0xFC31FFFF, |
0x00FFFFFF, 0x00000000, 0xFFFF0000, 0xFFFFFFFF, |
0xFFFFFFFF, 0xF80001FF, 0x00000003, 0x00000000, |
0x00000000, 0x00000000, 0x00000000, 0x00000000, |
0xFFFFD740, 0xFFFFFFFB, 0x547F7FFF, 0x000FFFFD, |
0xFFFFDFFE, 0xFFFFFFFF, 0xDFFEFFFF, 0xFFFFFFFF, |
0xFFFF0003, 0xFFFFFFFF, 0xFFFF199F, 0x033FCFFF, |
0x00000000, 0xFFFE0000, 0x027FFFFF, 0xFFFFFFFE, |
0x0000007F, 0x00000000, 0xFFFF0000, 0x000707FF, |
0x00000000, 0x07FFFFFE, 0x000007FE, 0xFFFE0000, |
0xFFFFFFFF, 0x7CFFFFFF, 0x002F7FFF, 0x00000060, |
0xFFFFFFE0, 0x23FFFFFF, 0xFF000000, 0x00000003, |
0xFFF99FE0, 0x03C5FDFF, 0xB0000000, 0x00030003, |
0xFFF987E0, 0x036DFDFF, 0x5E000000, 0x001C0000, |
0xFFFBAFE0, 0x23EDFDFF, 0x00000000, 0x00000001, |
0xFFF99FE0, 0x23CDFDFF, 0xB0000000, 0x00000003, |
0xD63DC7E0, 0x03BFC718, 0x00000000, 0x00000000, |
0xFFFDDFE0, 0x03EFFDFF, 0x00000000, 0x00000003, |
0xFFFDDFE0, 0x03EFFDFF, 0x40000000, 0x00000003, |
0xFFFDDFE0, 0x03FFFDFF, 0x00000000, 0x00000003, |
0x00000000, 0x00000000, 0x00000000, 0x00000000, |
0xFFFFFFFE, 0x000D7FFF, 0x0000003F, 0x00000000, |
0xFEF02596, 0x200D6CAE, 0x0000001F, 0x00000000, |
0x00000000, 0x00000000, 0xFFFFFEFF, 0x000003FF, |
0x00000000, 0x00000000, 0x00000000, 0x00000000, |
0x00000000, 0x00000000, 0x00000000, 0x00000000, |
0x00000000, 0xFFFFFFFF, 0xFFFF003F, 0x007FFFFF, |
0x0007DAED, 0x50000000, 0x82315001, 0x002C62AB, |
0x40000000, 0xF580C900, 0x00000007, 0x02010800, |
0xFFFFFFFF, 0xFFFFFFFF, 0xFFFFFFFF, 0xFFFFFFFF, |
0x0FFFFFFF, 0xFFFFFFFF, 0xFFFFFFFF, 0x03FFFFFF, |
0x3F3FFFFF, 0xFFFFFFFF, 0xAAFF3F3F, 0x3FFFFFFF, |
0xFFFFFFFF, 0x5FDFFFFF, 0x0FCF1FDC, 0x1FDC1FFF, |
0x00000000, 0x00004C40, 0x00000000, 0x00000000, |
0x00000007, 0x00000000, 0x00000000, 0x00000000, |
0x00000080, 0x000003FE, 0xFFFFFFFE, 0xFFFFFFFF, |
0x001FFFFF, 0xFFFFFFFE, 0xFFFFFFFF, 0x07FFFFFF, |
0xFFFFFFE0, 0x00001FFF, 0x00000000, 0x00000000, |
0x00000000, 0x00000000, 0x00000000, 0x00000000, |
0xFFFFFFFF, 0xFFFFFFFF, 0xFFFFFFFF, 0xFFFFFFFF, |
0xFFFFFFFF, 0x0000003F, 0x00000000, 0x00000000, |
0xFFFFFFFF, 0xFFFFFFFF, 0xFFFFFFFF, 0xFFFFFFFF, |
0xFFFFFFFF, 0x0000000F, 0x00000000, 0x00000000, |
0x00000000, 0x07FF6000, 0x87FFFFFE, 0x07FFFFFE, |
0x00000000, 0x00800000, 0xFF7FFFFF, 0xFF7FFFFF, |
0x00FFFFFF, 0x00000000, 0xFFFF0000, 0xFFFFFFFF, |
0xFFFFFFFF, 0xF80001FF, 0x00030003, 0x00000000, |
0xFFFFFFFF, 0xFFFFFFFF, 0x0000003F, 0x00000003, |
0xFFFFD7C0, 0xFFFFFFFB, 0x547F7FFF, 0x000FFFFD, |
0xFFFFDFFE, 0xFFFFFFFF, 0xDFFEFFFF, 0xFFFFFFFF, |
0xFFFF007B, 0xFFFFFFFF, 0xFFFF199F, 0x033FCFFF, |
0x00000000, 0xFFFE0000, 0x027FFFFF, 0xFFFFFFFE, |
0xFFFE007F, 0xBBFFFFFB, 0xFFFF0016, 0x000707FF, |
0x00000000, 0x07FFFFFE, 0x0007FFFF, 0xFFFF03FF, |
0xFFFFFFFF, 0x7CFFFFFF, 0xFFEF7FFF, 0x03FF3DFF, |
0xFFFFFFEE, 0xF3FFFFFF, 0xFF1E3FFF, 0x0000FFCF, |
0xFFF99FEE, 0xD3C5FDFF, 0xB080399F, 0x0003FFCF, |
0xFFF987E4, 0xD36DFDFF, 0x5E003987, 0x001FFFC0, |
0xFFFBAFEE, 0xF3EDFDFF, 0x00003BBF, 0x0000FFC1, |
0xFFF99FEE, 0xF3CDFDFF, 0xB0C0398F, 0x0000FFC3, |
0xD63DC7EC, 0xC3BFC718, 0x00803DC7, 0x0000FF80, |
0xFFFDDFEE, 0xC3EFFDFF, 0x00603DDF, 0x0000FFC3, |
0xFFFDDFEC, 0xC3EFFDFF, 0x40603DDF, 0x0000FFC3, |
0xFFFDDFEC, 0xC3FFFDFF, 0x00803DCF, 0x0000FFC3, |
0x00000000, 0x00000000, 0x00000000, 0x00000000, |
0xFFFFFFFE, 0x07FF7FFF, 0x03FF7FFF, 0x00000000, |
0xFEF02596, 0x3BFF6CAE, 0x03FF3F5F, 0x00000000, |
0x03000000, 0xC2A003FF, 0xFFFFFEFF, 0xFFFE03FF, |
0xFEBF0FDF, 0x02FE3FFF, 0x00000000, 0x00000000, |
0x00000000, 0x00000000, 0x00000000, 0x00000000, |
0x00000000, 0x00000000, 0x1FFF0000, 0x00000002, |
0x000000A0, 0x003EFFFE, 0xFFFFFFFE, 0xFFFFFFFF, |
0x661FFFFF, 0xFFFFFFFE, 0xFFFFFFFF, 0x77FFFFFF, |
}; |
static const unsigned char nmstrtPages[] = { |
0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x00, |
0x00, 0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F, |
0x10, 0x11, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, |
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x12, 0x13, |
0x00, 0x14, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, |
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, |
0x15, 0x16, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, |
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, |
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, |
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0x01, |
0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, |
0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, |
0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, |
0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, |
0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, |
0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, |
0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, |
0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, |
0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, |
0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x17, |
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, |
0x00, 0x00, 0x00, 0x00, 0x01, 0x01, 0x01, 0x01, |
0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, |
0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, |
0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, |
0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, |
0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x18, |
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, |
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, |
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, |
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, |
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, |
}; |
static const unsigned char namePages[] = { |
0x19, 0x03, 0x1A, 0x1B, 0x1C, 0x1D, 0x1E, 0x00, |
0x00, 0x1F, 0x20, 0x21, 0x22, 0x23, 0x24, 0x25, |
0x10, 0x11, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, |
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x12, 0x13, |
0x26, 0x14, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, |
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, |
0x27, 0x16, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, |
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, |
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, |
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0x01, |
0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, |
0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, |
0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, |
0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, |
0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, |
0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, |
0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, |
0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, |
0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, |
0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x17, |
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, |
0x00, 0x00, 0x00, 0x00, 0x01, 0x01, 0x01, 0x01, |
0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, |
0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, |
0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, |
0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, |
0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x18, |
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, |
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, |
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, |
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, |
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, |
}; |
/contrib/sdk/sources/expat/lib/utf8tab.h |
---|
0,0 → 1,37 |
/* Copyright (c) 1998, 1999 Thai Open Source Software Center Ltd |
See the file COPYING for copying permission. |
*/ |
/* 0x80 */ BT_TRAIL, BT_TRAIL, BT_TRAIL, BT_TRAIL, |
/* 0x84 */ BT_TRAIL, BT_TRAIL, BT_TRAIL, BT_TRAIL, |
/* 0x88 */ BT_TRAIL, BT_TRAIL, BT_TRAIL, BT_TRAIL, |
/* 0x8C */ BT_TRAIL, BT_TRAIL, BT_TRAIL, BT_TRAIL, |
/* 0x90 */ BT_TRAIL, BT_TRAIL, BT_TRAIL, BT_TRAIL, |
/* 0x94 */ BT_TRAIL, BT_TRAIL, BT_TRAIL, BT_TRAIL, |
/* 0x98 */ BT_TRAIL, BT_TRAIL, BT_TRAIL, BT_TRAIL, |
/* 0x9C */ BT_TRAIL, BT_TRAIL, BT_TRAIL, BT_TRAIL, |
/* 0xA0 */ BT_TRAIL, BT_TRAIL, BT_TRAIL, BT_TRAIL, |
/* 0xA4 */ BT_TRAIL, BT_TRAIL, BT_TRAIL, BT_TRAIL, |
/* 0xA8 */ BT_TRAIL, BT_TRAIL, BT_TRAIL, BT_TRAIL, |
/* 0xAC */ BT_TRAIL, BT_TRAIL, BT_TRAIL, BT_TRAIL, |
/* 0xB0 */ BT_TRAIL, BT_TRAIL, BT_TRAIL, BT_TRAIL, |
/* 0xB4 */ BT_TRAIL, BT_TRAIL, BT_TRAIL, BT_TRAIL, |
/* 0xB8 */ BT_TRAIL, BT_TRAIL, BT_TRAIL, BT_TRAIL, |
/* 0xBC */ BT_TRAIL, BT_TRAIL, BT_TRAIL, BT_TRAIL, |
/* 0xC0 */ BT_LEAD2, BT_LEAD2, BT_LEAD2, BT_LEAD2, |
/* 0xC4 */ BT_LEAD2, BT_LEAD2, BT_LEAD2, BT_LEAD2, |
/* 0xC8 */ BT_LEAD2, BT_LEAD2, BT_LEAD2, BT_LEAD2, |
/* 0xCC */ BT_LEAD2, BT_LEAD2, BT_LEAD2, BT_LEAD2, |
/* 0xD0 */ BT_LEAD2, BT_LEAD2, BT_LEAD2, BT_LEAD2, |
/* 0xD4 */ BT_LEAD2, BT_LEAD2, BT_LEAD2, BT_LEAD2, |
/* 0xD8 */ BT_LEAD2, BT_LEAD2, BT_LEAD2, BT_LEAD2, |
/* 0xDC */ BT_LEAD2, BT_LEAD2, BT_LEAD2, BT_LEAD2, |
/* 0xE0 */ BT_LEAD3, BT_LEAD3, BT_LEAD3, BT_LEAD3, |
/* 0xE4 */ BT_LEAD3, BT_LEAD3, BT_LEAD3, BT_LEAD3, |
/* 0xE8 */ BT_LEAD3, BT_LEAD3, BT_LEAD3, BT_LEAD3, |
/* 0xEC */ BT_LEAD3, BT_LEAD3, BT_LEAD3, BT_LEAD3, |
/* 0xF0 */ BT_LEAD4, BT_LEAD4, BT_LEAD4, BT_LEAD4, |
/* 0xF4 */ BT_LEAD4, BT_NONXML, BT_NONXML, BT_NONXML, |
/* 0xF8 */ BT_NONXML, BT_NONXML, BT_NONXML, BT_NONXML, |
/* 0xFC */ BT_NONXML, BT_NONXML, BT_MALFORM, BT_MALFORM, |
/contrib/sdk/sources/expat/lib/winconfig.h |
---|
0,0 → 1,30 |
/*================================================================ |
** Copyright 2000, Clark Cooper |
** All rights reserved. |
** |
** This is free software. You are permitted to copy, distribute, or modify |
** it under the terms of the MIT/X license (contained in the COPYING file |
** with this distribution.) |
*/ |
#ifndef WINCONFIG_H |
#define WINCONFIG_H |
#define WIN32_LEAN_AND_MEAN |
#include <windows.h> |
#undef WIN32_LEAN_AND_MEAN |
#include <memory.h> |
#include <string.h> |
#define XML_NS 1 |
#define XML_DTD 1 |
#define XML_CONTEXT_BYTES 1024 |
/* we will assume all Windows platforms are little endian */ |
#define BYTEORDER 1234 |
/* Windows has memmove() available. */ |
#define HAVE_MEMMOVE |
#endif /* ndef WINCONFIG_H */ |
/contrib/sdk/sources/expat/lib/xmlparse.c |
---|
0,0 → 1,6403 |
/* Copyright (c) 1998, 1999, 2000 Thai Open Source Software Center Ltd |
See the file COPYING for copying permission. |
*/ |
#include <stddef.h> |
#include <string.h> /* memset(), memcpy() */ |
#include <assert.h> |
#include <limits.h> /* UINT_MAX */ |
#include <time.h> /* time() */ |
#define XML_BUILDING_EXPAT 1 |
#ifdef COMPILED_FROM_DSP |
#include "winconfig.h" |
#elif defined(MACOS_CLASSIC) |
#include "macconfig.h" |
#elif defined(__amigaos__) |
#include "amigaconfig.h" |
#elif defined(__WATCOMC__) |
#include "watcomconfig.h" |
#elif defined(HAVE_EXPAT_CONFIG_H) |
#include <expat_config.h> |
#endif /* ndef COMPILED_FROM_DSP */ |
#include "ascii.h" |
#include "expat.h" |
#ifdef XML_UNICODE |
#define XML_ENCODE_MAX XML_UTF16_ENCODE_MAX |
#define XmlConvert XmlUtf16Convert |
#define XmlGetInternalEncoding XmlGetUtf16InternalEncoding |
#define XmlGetInternalEncodingNS XmlGetUtf16InternalEncodingNS |
#define XmlEncode XmlUtf16Encode |
/* Using pointer subtraction to convert to integer type. */ |
#define MUST_CONVERT(enc, s) (!(enc)->isUtf16 || (((char *)(s) - (char *)NULL) & 1)) |
typedef unsigned short ICHAR; |
#else |
#define XML_ENCODE_MAX XML_UTF8_ENCODE_MAX |
#define XmlConvert XmlUtf8Convert |
#define XmlGetInternalEncoding XmlGetUtf8InternalEncoding |
#define XmlGetInternalEncodingNS XmlGetUtf8InternalEncodingNS |
#define XmlEncode XmlUtf8Encode |
#define MUST_CONVERT(enc, s) (!(enc)->isUtf8) |
typedef char ICHAR; |
#endif |
#ifndef XML_NS |
#define XmlInitEncodingNS XmlInitEncoding |
#define XmlInitUnknownEncodingNS XmlInitUnknownEncoding |
#undef XmlGetInternalEncodingNS |
#define XmlGetInternalEncodingNS XmlGetInternalEncoding |
#define XmlParseXmlDeclNS XmlParseXmlDecl |
#endif |
#ifdef XML_UNICODE |
#ifdef XML_UNICODE_WCHAR_T |
#define XML_T(x) (const wchar_t)x |
#define XML_L(x) L ## x |
#else |
#define XML_T(x) (const unsigned short)x |
#define XML_L(x) x |
#endif |
#else |
#define XML_T(x) x |
#define XML_L(x) x |
#endif |
/* Round up n to be a multiple of sz, where sz is a power of 2. */ |
#define ROUND_UP(n, sz) (((n) + ((sz) - 1)) & ~((sz) - 1)) |
/* Handle the case where memmove() doesn't exist. */ |
#ifndef HAVE_MEMMOVE |
#ifdef HAVE_BCOPY |
#define memmove(d,s,l) bcopy((s),(d),(l)) |
#else |
#error memmove does not exist on this platform, nor is a substitute available |
#endif /* HAVE_BCOPY */ |
#endif /* HAVE_MEMMOVE */ |
#include "internal.h" |
#include "xmltok.h" |
#include "xmlrole.h" |
typedef const XML_Char *KEY; |
typedef struct { |
KEY name; |
} NAMED; |
typedef struct { |
NAMED **v; |
unsigned char power; |
size_t size; |
size_t used; |
const XML_Memory_Handling_Suite *mem; |
} HASH_TABLE; |
/* Basic character hash algorithm, taken from Python's string hash: |
h = h * 1000003 ^ character, the constant being a prime number. |
*/ |
#ifdef XML_UNICODE |
#define CHAR_HASH(h, c) \ |
(((h) * 0xF4243) ^ (unsigned short)(c)) |
#else |
#define CHAR_HASH(h, c) \ |
(((h) * 0xF4243) ^ (unsigned char)(c)) |
#endif |
/* For probing (after a collision) we need a step size relative prime |
to the hash table size, which is a power of 2. We use double-hashing, |
since we can calculate a second hash value cheaply by taking those bits |
of the first hash value that were discarded (masked out) when the table |
index was calculated: index = hash & mask, where mask = table->size - 1. |
We limit the maximum step size to table->size / 4 (mask >> 2) and make |
it odd, since odd numbers are always relative prime to a power of 2. |
*/ |
#define SECOND_HASH(hash, mask, power) \ |
((((hash) & ~(mask)) >> ((power) - 1)) & ((mask) >> 2)) |
#define PROBE_STEP(hash, mask, power) \ |
((unsigned char)((SECOND_HASH(hash, mask, power)) | 1)) |
typedef struct { |
NAMED **p; |
NAMED **end; |
} HASH_TABLE_ITER; |
#define INIT_TAG_BUF_SIZE 32 /* must be a multiple of sizeof(XML_Char) */ |
#define INIT_DATA_BUF_SIZE 1024 |
#define INIT_ATTS_SIZE 16 |
#define INIT_ATTS_VERSION 0xFFFFFFFF |
#define INIT_BLOCK_SIZE 1024 |
#define INIT_BUFFER_SIZE 1024 |
#define EXPAND_SPARE 24 |
typedef struct binding { |
struct prefix *prefix; |
struct binding *nextTagBinding; |
struct binding *prevPrefixBinding; |
const struct attribute_id *attId; |
XML_Char *uri; |
int uriLen; |
int uriAlloc; |
} BINDING; |
typedef struct prefix { |
const XML_Char *name; |
BINDING *binding; |
} PREFIX; |
typedef struct { |
const XML_Char *str; |
const XML_Char *localPart; |
const XML_Char *prefix; |
int strLen; |
int uriLen; |
int prefixLen; |
} TAG_NAME; |
/* TAG represents an open element. |
The name of the element is stored in both the document and API |
encodings. The memory buffer 'buf' is a separately-allocated |
memory area which stores the name. During the XML_Parse()/ |
XMLParseBuffer() when the element is open, the memory for the 'raw' |
version of the name (in the document encoding) is shared with the |
document buffer. If the element is open across calls to |
XML_Parse()/XML_ParseBuffer(), the buffer is re-allocated to |
contain the 'raw' name as well. |
A parser re-uses these structures, maintaining a list of allocated |
TAG objects in a free list. |
*/ |
typedef struct tag { |
struct tag *parent; /* parent of this element */ |
const char *rawName; /* tagName in the original encoding */ |
int rawNameLength; |
TAG_NAME name; /* tagName in the API encoding */ |
char *buf; /* buffer for name components */ |
char *bufEnd; /* end of the buffer */ |
BINDING *bindings; |
} TAG; |
typedef struct { |
const XML_Char *name; |
const XML_Char *textPtr; |
int textLen; /* length in XML_Chars */ |
int processed; /* # of processed bytes - when suspended */ |
const XML_Char *systemId; |
const XML_Char *base; |
const XML_Char *publicId; |
const XML_Char *notation; |
XML_Bool open; |
XML_Bool is_param; |
XML_Bool is_internal; /* true if declared in internal subset outside PE */ |
} ENTITY; |
typedef struct { |
enum XML_Content_Type type; |
enum XML_Content_Quant quant; |
const XML_Char * name; |
int firstchild; |
int lastchild; |
int childcnt; |
int nextsib; |
} CONTENT_SCAFFOLD; |
#define INIT_SCAFFOLD_ELEMENTS 32 |
typedef struct block { |
struct block *next; |
int size; |
XML_Char s[1]; |
} BLOCK; |
typedef struct { |
BLOCK *blocks; |
BLOCK *freeBlocks; |
const XML_Char *end; |
XML_Char *ptr; |
XML_Char *start; |
const XML_Memory_Handling_Suite *mem; |
} STRING_POOL; |
/* The XML_Char before the name is used to determine whether |
an attribute has been specified. */ |
typedef struct attribute_id { |
XML_Char *name; |
PREFIX *prefix; |
XML_Bool maybeTokenized; |
XML_Bool xmlns; |
} ATTRIBUTE_ID; |
typedef struct { |
const ATTRIBUTE_ID *id; |
XML_Bool isCdata; |
const XML_Char *value; |
} DEFAULT_ATTRIBUTE; |
typedef struct { |
unsigned long version; |
unsigned long hash; |
const XML_Char *uriName; |
} NS_ATT; |
typedef struct { |
const XML_Char *name; |
PREFIX *prefix; |
const ATTRIBUTE_ID *idAtt; |
int nDefaultAtts; |
int allocDefaultAtts; |
DEFAULT_ATTRIBUTE *defaultAtts; |
} ELEMENT_TYPE; |
typedef struct { |
HASH_TABLE generalEntities; |
HASH_TABLE elementTypes; |
HASH_TABLE attributeIds; |
HASH_TABLE prefixes; |
STRING_POOL pool; |
STRING_POOL entityValuePool; |
/* false once a parameter entity reference has been skipped */ |
XML_Bool keepProcessing; |
/* true once an internal or external PE reference has been encountered; |
this includes the reference to an external subset */ |
XML_Bool hasParamEntityRefs; |
XML_Bool standalone; |
#ifdef XML_DTD |
/* indicates if external PE has been read */ |
XML_Bool paramEntityRead; |
HASH_TABLE paramEntities; |
#endif /* XML_DTD */ |
PREFIX defaultPrefix; |
/* === scaffolding for building content model === */ |
XML_Bool in_eldecl; |
CONTENT_SCAFFOLD *scaffold; |
unsigned contentStringLen; |
unsigned scaffSize; |
unsigned scaffCount; |
int scaffLevel; |
int *scaffIndex; |
} DTD; |
typedef struct open_internal_entity { |
const char *internalEventPtr; |
const char *internalEventEndPtr; |
struct open_internal_entity *next; |
ENTITY *entity; |
int startTagLevel; |
XML_Bool betweenDecl; /* WFC: PE Between Declarations */ |
} OPEN_INTERNAL_ENTITY; |
typedef enum XML_Error PTRCALL Processor(XML_Parser parser, |
const char *start, |
const char *end, |
const char **endPtr); |
static Processor prologProcessor; |
static Processor prologInitProcessor; |
static Processor contentProcessor; |
static Processor cdataSectionProcessor; |
#ifdef XML_DTD |
static Processor ignoreSectionProcessor; |
static Processor externalParEntProcessor; |
static Processor externalParEntInitProcessor; |
static Processor entityValueProcessor; |
static Processor entityValueInitProcessor; |
#endif /* XML_DTD */ |
static Processor epilogProcessor; |
static Processor errorProcessor; |
static Processor externalEntityInitProcessor; |
static Processor externalEntityInitProcessor2; |
static Processor externalEntityInitProcessor3; |
static Processor externalEntityContentProcessor; |
static Processor internalEntityProcessor; |
static enum XML_Error |
handleUnknownEncoding(XML_Parser parser, const XML_Char *encodingName); |
static enum XML_Error |
processXmlDecl(XML_Parser parser, int isGeneralTextEntity, |
const char *s, const char *next); |
static enum XML_Error |
initializeEncoding(XML_Parser parser); |
static enum XML_Error |
doProlog(XML_Parser parser, const ENCODING *enc, const char *s, |
const char *end, int tok, const char *next, const char **nextPtr, |
XML_Bool haveMore); |
static enum XML_Error |
processInternalEntity(XML_Parser parser, ENTITY *entity, |
XML_Bool betweenDecl); |
static enum XML_Error |
doContent(XML_Parser parser, int startTagLevel, const ENCODING *enc, |
const char *start, const char *end, const char **endPtr, |
XML_Bool haveMore); |
static enum XML_Error |
doCdataSection(XML_Parser parser, const ENCODING *, const char **startPtr, |
const char *end, const char **nextPtr, XML_Bool haveMore); |
#ifdef XML_DTD |
static enum XML_Error |
doIgnoreSection(XML_Parser parser, const ENCODING *, const char **startPtr, |
const char *end, const char **nextPtr, XML_Bool haveMore); |
#endif /* XML_DTD */ |
static enum XML_Error |
storeAtts(XML_Parser parser, const ENCODING *, const char *s, |
TAG_NAME *tagNamePtr, BINDING **bindingsPtr); |
static enum XML_Error |
addBinding(XML_Parser parser, PREFIX *prefix, const ATTRIBUTE_ID *attId, |
const XML_Char *uri, BINDING **bindingsPtr); |
static int |
defineAttribute(ELEMENT_TYPE *type, ATTRIBUTE_ID *, XML_Bool isCdata, |
XML_Bool isId, const XML_Char *dfltValue, XML_Parser parser); |
static enum XML_Error |
storeAttributeValue(XML_Parser parser, const ENCODING *, XML_Bool isCdata, |
const char *, const char *, STRING_POOL *); |
static enum XML_Error |
appendAttributeValue(XML_Parser parser, const ENCODING *, XML_Bool isCdata, |
const char *, const char *, STRING_POOL *); |
static ATTRIBUTE_ID * |
getAttributeId(XML_Parser parser, const ENCODING *enc, const char *start, |
const char *end); |
static int |
setElementTypePrefix(XML_Parser parser, ELEMENT_TYPE *); |
static enum XML_Error |
storeEntityValue(XML_Parser parser, const ENCODING *enc, const char *start, |
const char *end); |
static int |
reportProcessingInstruction(XML_Parser parser, const ENCODING *enc, |
const char *start, const char *end); |
static int |
reportComment(XML_Parser parser, const ENCODING *enc, const char *start, |
const char *end); |
static void |
reportDefault(XML_Parser parser, const ENCODING *enc, const char *start, |
const char *end); |
static const XML_Char * getContext(XML_Parser parser); |
static XML_Bool |
setContext(XML_Parser parser, const XML_Char *context); |
static void FASTCALL normalizePublicId(XML_Char *s); |
static DTD * dtdCreate(const XML_Memory_Handling_Suite *ms); |
/* do not call if parentParser != NULL */ |
static void dtdReset(DTD *p, const XML_Memory_Handling_Suite *ms); |
static void |
dtdDestroy(DTD *p, XML_Bool isDocEntity, const XML_Memory_Handling_Suite *ms); |
static int |
dtdCopy(XML_Parser oldParser, |
DTD *newDtd, const DTD *oldDtd, const XML_Memory_Handling_Suite *ms); |
static int |
copyEntityTable(XML_Parser oldParser, |
HASH_TABLE *, STRING_POOL *, const HASH_TABLE *); |
static NAMED * |
lookup(XML_Parser parser, HASH_TABLE *table, KEY name, size_t createSize); |
static void FASTCALL |
hashTableInit(HASH_TABLE *, const XML_Memory_Handling_Suite *ms); |
static void FASTCALL hashTableClear(HASH_TABLE *); |
static void FASTCALL hashTableDestroy(HASH_TABLE *); |
static void FASTCALL |
hashTableIterInit(HASH_TABLE_ITER *, const HASH_TABLE *); |
static NAMED * FASTCALL hashTableIterNext(HASH_TABLE_ITER *); |
static void FASTCALL |
poolInit(STRING_POOL *, const XML_Memory_Handling_Suite *ms); |
static void FASTCALL poolClear(STRING_POOL *); |
static void FASTCALL poolDestroy(STRING_POOL *); |
static XML_Char * |
poolAppend(STRING_POOL *pool, const ENCODING *enc, |
const char *ptr, const char *end); |
static XML_Char * |
poolStoreString(STRING_POOL *pool, const ENCODING *enc, |
const char *ptr, const char *end); |
static XML_Bool FASTCALL poolGrow(STRING_POOL *pool); |
static const XML_Char * FASTCALL |
poolCopyString(STRING_POOL *pool, const XML_Char *s); |
static const XML_Char * |
poolCopyStringN(STRING_POOL *pool, const XML_Char *s, int n); |
static const XML_Char * FASTCALL |
poolAppendString(STRING_POOL *pool, const XML_Char *s); |
static int FASTCALL nextScaffoldPart(XML_Parser parser); |
static XML_Content * build_model(XML_Parser parser); |
static ELEMENT_TYPE * |
getElementType(XML_Parser parser, const ENCODING *enc, |
const char *ptr, const char *end); |
static unsigned long generate_hash_secret_salt(void); |
static XML_Bool startParsing(XML_Parser parser); |
static XML_Parser |
parserCreate(const XML_Char *encodingName, |
const XML_Memory_Handling_Suite *memsuite, |
const XML_Char *nameSep, |
DTD *dtd); |
static void |
parserInit(XML_Parser parser, const XML_Char *encodingName); |
#define poolStart(pool) ((pool)->start) |
#define poolEnd(pool) ((pool)->ptr) |
#define poolLength(pool) ((pool)->ptr - (pool)->start) |
#define poolChop(pool) ((void)--(pool->ptr)) |
#define poolLastChar(pool) (((pool)->ptr)[-1]) |
#define poolDiscard(pool) ((pool)->ptr = (pool)->start) |
#define poolFinish(pool) ((pool)->start = (pool)->ptr) |
#define poolAppendChar(pool, c) \ |
(((pool)->ptr == (pool)->end && !poolGrow(pool)) \ |
? 0 \ |
: ((*((pool)->ptr)++ = c), 1)) |
struct XML_ParserStruct { |
/* The first member must be userData so that the XML_GetUserData |
macro works. */ |
void *m_userData; |
void *m_handlerArg; |
char *m_buffer; |
const XML_Memory_Handling_Suite m_mem; |
/* first character to be parsed */ |
const char *m_bufferPtr; |
/* past last character to be parsed */ |
char *m_bufferEnd; |
/* allocated end of buffer */ |
const char *m_bufferLim; |
XML_Index m_parseEndByteIndex; |
const char *m_parseEndPtr; |
XML_Char *m_dataBuf; |
XML_Char *m_dataBufEnd; |
XML_StartElementHandler m_startElementHandler; |
XML_EndElementHandler m_endElementHandler; |
XML_CharacterDataHandler m_characterDataHandler; |
XML_ProcessingInstructionHandler m_processingInstructionHandler; |
XML_CommentHandler m_commentHandler; |
XML_StartCdataSectionHandler m_startCdataSectionHandler; |
XML_EndCdataSectionHandler m_endCdataSectionHandler; |
XML_DefaultHandler m_defaultHandler; |
XML_StartDoctypeDeclHandler m_startDoctypeDeclHandler; |
XML_EndDoctypeDeclHandler m_endDoctypeDeclHandler; |
XML_UnparsedEntityDeclHandler m_unparsedEntityDeclHandler; |
XML_NotationDeclHandler m_notationDeclHandler; |
XML_StartNamespaceDeclHandler m_startNamespaceDeclHandler; |
XML_EndNamespaceDeclHandler m_endNamespaceDeclHandler; |
XML_NotStandaloneHandler m_notStandaloneHandler; |
XML_ExternalEntityRefHandler m_externalEntityRefHandler; |
XML_Parser m_externalEntityRefHandlerArg; |
XML_SkippedEntityHandler m_skippedEntityHandler; |
XML_UnknownEncodingHandler m_unknownEncodingHandler; |
XML_ElementDeclHandler m_elementDeclHandler; |
XML_AttlistDeclHandler m_attlistDeclHandler; |
XML_EntityDeclHandler m_entityDeclHandler; |
XML_XmlDeclHandler m_xmlDeclHandler; |
const ENCODING *m_encoding; |
INIT_ENCODING m_initEncoding; |
const ENCODING *m_internalEncoding; |
const XML_Char *m_protocolEncodingName; |
XML_Bool m_ns; |
XML_Bool m_ns_triplets; |
void *m_unknownEncodingMem; |
void *m_unknownEncodingData; |
void *m_unknownEncodingHandlerData; |
void (XMLCALL *m_unknownEncodingRelease)(void *); |
PROLOG_STATE m_prologState; |
Processor *m_processor; |
enum XML_Error m_errorCode; |
const char *m_eventPtr; |
const char *m_eventEndPtr; |
const char *m_positionPtr; |
OPEN_INTERNAL_ENTITY *m_openInternalEntities; |
OPEN_INTERNAL_ENTITY *m_freeInternalEntities; |
XML_Bool m_defaultExpandInternalEntities; |
int m_tagLevel; |
ENTITY *m_declEntity; |
const XML_Char *m_doctypeName; |
const XML_Char *m_doctypeSysid; |
const XML_Char *m_doctypePubid; |
const XML_Char *m_declAttributeType; |
const XML_Char *m_declNotationName; |
const XML_Char *m_declNotationPublicId; |
ELEMENT_TYPE *m_declElementType; |
ATTRIBUTE_ID *m_declAttributeId; |
XML_Bool m_declAttributeIsCdata; |
XML_Bool m_declAttributeIsId; |
DTD *m_dtd; |
const XML_Char *m_curBase; |
TAG *m_tagStack; |
TAG *m_freeTagList; |
BINDING *m_inheritedBindings; |
BINDING *m_freeBindingList; |
int m_attsSize; |
int m_nSpecifiedAtts; |
int m_idAttIndex; |
ATTRIBUTE *m_atts; |
NS_ATT *m_nsAtts; |
unsigned long m_nsAttsVersion; |
unsigned char m_nsAttsPower; |
#ifdef XML_ATTR_INFO |
XML_AttrInfo *m_attInfo; |
#endif |
POSITION m_position; |
STRING_POOL m_tempPool; |
STRING_POOL m_temp2Pool; |
char *m_groupConnector; |
unsigned int m_groupSize; |
XML_Char m_namespaceSeparator; |
XML_Parser m_parentParser; |
XML_ParsingStatus m_parsingStatus; |
#ifdef XML_DTD |
XML_Bool m_isParamEntity; |
XML_Bool m_useForeignDTD; |
enum XML_ParamEntityParsing m_paramEntityParsing; |
#endif |
unsigned long m_hash_secret_salt; |
}; |
#define MALLOC(s) (parser->m_mem.malloc_fcn((s))) |
#define REALLOC(p,s) (parser->m_mem.realloc_fcn((p),(s))) |
#define FREE(p) (parser->m_mem.free_fcn((p))) |
#define userData (parser->m_userData) |
#define handlerArg (parser->m_handlerArg) |
#define startElementHandler (parser->m_startElementHandler) |
#define endElementHandler (parser->m_endElementHandler) |
#define characterDataHandler (parser->m_characterDataHandler) |
#define processingInstructionHandler \ |
(parser->m_processingInstructionHandler) |
#define commentHandler (parser->m_commentHandler) |
#define startCdataSectionHandler \ |
(parser->m_startCdataSectionHandler) |
#define endCdataSectionHandler (parser->m_endCdataSectionHandler) |
#define defaultHandler (parser->m_defaultHandler) |
#define startDoctypeDeclHandler (parser->m_startDoctypeDeclHandler) |
#define endDoctypeDeclHandler (parser->m_endDoctypeDeclHandler) |
#define unparsedEntityDeclHandler \ |
(parser->m_unparsedEntityDeclHandler) |
#define notationDeclHandler (parser->m_notationDeclHandler) |
#define startNamespaceDeclHandler \ |
(parser->m_startNamespaceDeclHandler) |
#define endNamespaceDeclHandler (parser->m_endNamespaceDeclHandler) |
#define notStandaloneHandler (parser->m_notStandaloneHandler) |
#define externalEntityRefHandler \ |
(parser->m_externalEntityRefHandler) |
#define externalEntityRefHandlerArg \ |
(parser->m_externalEntityRefHandlerArg) |
#define internalEntityRefHandler \ |
(parser->m_internalEntityRefHandler) |
#define skippedEntityHandler (parser->m_skippedEntityHandler) |
#define unknownEncodingHandler (parser->m_unknownEncodingHandler) |
#define elementDeclHandler (parser->m_elementDeclHandler) |
#define attlistDeclHandler (parser->m_attlistDeclHandler) |
#define entityDeclHandler (parser->m_entityDeclHandler) |
#define xmlDeclHandler (parser->m_xmlDeclHandler) |
#define encoding (parser->m_encoding) |
#define initEncoding (parser->m_initEncoding) |
#define internalEncoding (parser->m_internalEncoding) |
#define unknownEncodingMem (parser->m_unknownEncodingMem) |
#define unknownEncodingData (parser->m_unknownEncodingData) |
#define unknownEncodingHandlerData \ |
(parser->m_unknownEncodingHandlerData) |
#define unknownEncodingRelease (parser->m_unknownEncodingRelease) |
#define protocolEncodingName (parser->m_protocolEncodingName) |
#define ns (parser->m_ns) |
#define ns_triplets (parser->m_ns_triplets) |
#define prologState (parser->m_prologState) |
#define processor (parser->m_processor) |
#define errorCode (parser->m_errorCode) |
#define eventPtr (parser->m_eventPtr) |
#define eventEndPtr (parser->m_eventEndPtr) |
#define positionPtr (parser->m_positionPtr) |
#define position (parser->m_position) |
#define openInternalEntities (parser->m_openInternalEntities) |
#define freeInternalEntities (parser->m_freeInternalEntities) |
#define defaultExpandInternalEntities \ |
(parser->m_defaultExpandInternalEntities) |
#define tagLevel (parser->m_tagLevel) |
#define buffer (parser->m_buffer) |
#define bufferPtr (parser->m_bufferPtr) |
#define bufferEnd (parser->m_bufferEnd) |
#define parseEndByteIndex (parser->m_parseEndByteIndex) |
#define parseEndPtr (parser->m_parseEndPtr) |
#define bufferLim (parser->m_bufferLim) |
#define dataBuf (parser->m_dataBuf) |
#define dataBufEnd (parser->m_dataBufEnd) |
#define _dtd (parser->m_dtd) |
#define curBase (parser->m_curBase) |
#define declEntity (parser->m_declEntity) |
#define doctypeName (parser->m_doctypeName) |
#define doctypeSysid (parser->m_doctypeSysid) |
#define doctypePubid (parser->m_doctypePubid) |
#define declAttributeType (parser->m_declAttributeType) |
#define declNotationName (parser->m_declNotationName) |
#define declNotationPublicId (parser->m_declNotationPublicId) |
#define declElementType (parser->m_declElementType) |
#define declAttributeId (parser->m_declAttributeId) |
#define declAttributeIsCdata (parser->m_declAttributeIsCdata) |
#define declAttributeIsId (parser->m_declAttributeIsId) |
#define freeTagList (parser->m_freeTagList) |
#define freeBindingList (parser->m_freeBindingList) |
#define inheritedBindings (parser->m_inheritedBindings) |
#define tagStack (parser->m_tagStack) |
#define atts (parser->m_atts) |
#define attsSize (parser->m_attsSize) |
#define nSpecifiedAtts (parser->m_nSpecifiedAtts) |
#define idAttIndex (parser->m_idAttIndex) |
#define nsAtts (parser->m_nsAtts) |
#define nsAttsVersion (parser->m_nsAttsVersion) |
#define nsAttsPower (parser->m_nsAttsPower) |
#define attInfo (parser->m_attInfo) |
#define tempPool (parser->m_tempPool) |
#define temp2Pool (parser->m_temp2Pool) |
#define groupConnector (parser->m_groupConnector) |
#define groupSize (parser->m_groupSize) |
#define namespaceSeparator (parser->m_namespaceSeparator) |
#define parentParser (parser->m_parentParser) |
#define ps_parsing (parser->m_parsingStatus.parsing) |
#define ps_finalBuffer (parser->m_parsingStatus.finalBuffer) |
#ifdef XML_DTD |
#define isParamEntity (parser->m_isParamEntity) |
#define useForeignDTD (parser->m_useForeignDTD) |
#define paramEntityParsing (parser->m_paramEntityParsing) |
#endif /* XML_DTD */ |
#define hash_secret_salt (parser->m_hash_secret_salt) |
XML_Parser XMLCALL |
XML_ParserCreate(const XML_Char *encodingName) |
{ |
return XML_ParserCreate_MM(encodingName, NULL, NULL); |
} |
XML_Parser XMLCALL |
XML_ParserCreateNS(const XML_Char *encodingName, XML_Char nsSep) |
{ |
XML_Char tmp[2]; |
*tmp = nsSep; |
return XML_ParserCreate_MM(encodingName, NULL, tmp); |
} |
static const XML_Char implicitContext[] = { |
ASCII_x, ASCII_m, ASCII_l, ASCII_EQUALS, ASCII_h, ASCII_t, ASCII_t, ASCII_p, |
ASCII_COLON, ASCII_SLASH, ASCII_SLASH, ASCII_w, ASCII_w, ASCII_w, |
ASCII_PERIOD, ASCII_w, ASCII_3, ASCII_PERIOD, ASCII_o, ASCII_r, ASCII_g, |
ASCII_SLASH, ASCII_X, ASCII_M, ASCII_L, ASCII_SLASH, ASCII_1, ASCII_9, |
ASCII_9, ASCII_8, ASCII_SLASH, ASCII_n, ASCII_a, ASCII_m, ASCII_e, |
ASCII_s, ASCII_p, ASCII_a, ASCII_c, ASCII_e, '\0' |
}; |
static unsigned long |
generate_hash_secret_salt(void) |
{ |
unsigned int seed = time(NULL) % UINT_MAX; |
srand(seed); |
return rand(); |
} |
static XML_Bool /* only valid for root parser */ |
startParsing(XML_Parser parser) |
{ |
/* hash functions must be initialized before setContext() is called */ |
if (hash_secret_salt == 0) |
hash_secret_salt = generate_hash_secret_salt(); |
if (ns) { |
/* implicit context only set for root parser, since child |
parsers (i.e. external entity parsers) will inherit it |
*/ |
return setContext(parser, implicitContext); |
} |
return XML_TRUE; |
} |
XML_Parser XMLCALL |
XML_ParserCreate_MM(const XML_Char *encodingName, |
const XML_Memory_Handling_Suite *memsuite, |
const XML_Char *nameSep) |
{ |
return parserCreate(encodingName, memsuite, nameSep, NULL); |
} |
static XML_Parser |
parserCreate(const XML_Char *encodingName, |
const XML_Memory_Handling_Suite *memsuite, |
const XML_Char *nameSep, |
DTD *dtd) |
{ |
XML_Parser parser; |
if (memsuite) { |
XML_Memory_Handling_Suite *mtemp; |
parser = (XML_Parser) |
memsuite->malloc_fcn(sizeof(struct XML_ParserStruct)); |
if (parser != NULL) { |
mtemp = (XML_Memory_Handling_Suite *)&(parser->m_mem); |
mtemp->malloc_fcn = memsuite->malloc_fcn; |
mtemp->realloc_fcn = memsuite->realloc_fcn; |
mtemp->free_fcn = memsuite->free_fcn; |
} |
} |
else { |
XML_Memory_Handling_Suite *mtemp; |
parser = (XML_Parser)malloc(sizeof(struct XML_ParserStruct)); |
if (parser != NULL) { |
mtemp = (XML_Memory_Handling_Suite *)&(parser->m_mem); |
mtemp->malloc_fcn = malloc; |
mtemp->realloc_fcn = realloc; |
mtemp->free_fcn = free; |
} |
} |
if (!parser) |
return parser; |
buffer = NULL; |
bufferLim = NULL; |
attsSize = INIT_ATTS_SIZE; |
atts = (ATTRIBUTE *)MALLOC(attsSize * sizeof(ATTRIBUTE)); |
if (atts == NULL) { |
FREE(parser); |
return NULL; |
} |
#ifdef XML_ATTR_INFO |
attInfo = (XML_AttrInfo*)MALLOC(attsSize * sizeof(XML_AttrInfo)); |
if (attInfo == NULL) { |
FREE(atts); |
FREE(parser); |
return NULL; |
} |
#endif |
dataBuf = (XML_Char *)MALLOC(INIT_DATA_BUF_SIZE * sizeof(XML_Char)); |
if (dataBuf == NULL) { |
FREE(atts); |
#ifdef XML_ATTR_INFO |
FREE(attInfo); |
#endif |
FREE(parser); |
return NULL; |
} |
dataBufEnd = dataBuf + INIT_DATA_BUF_SIZE; |
if (dtd) |
_dtd = dtd; |
else { |
_dtd = dtdCreate(&parser->m_mem); |
if (_dtd == NULL) { |
FREE(dataBuf); |
FREE(atts); |
#ifdef XML_ATTR_INFO |
FREE(attInfo); |
#endif |
FREE(parser); |
return NULL; |
} |
} |
freeBindingList = NULL; |
freeTagList = NULL; |
freeInternalEntities = NULL; |
groupSize = 0; |
groupConnector = NULL; |
unknownEncodingHandler = NULL; |
unknownEncodingHandlerData = NULL; |
namespaceSeparator = ASCII_EXCL; |
ns = XML_FALSE; |
ns_triplets = XML_FALSE; |
nsAtts = NULL; |
nsAttsVersion = 0; |
nsAttsPower = 0; |
poolInit(&tempPool, &(parser->m_mem)); |
poolInit(&temp2Pool, &(parser->m_mem)); |
parserInit(parser, encodingName); |
if (encodingName && !protocolEncodingName) { |
XML_ParserFree(parser); |
return NULL; |
} |
if (nameSep) { |
ns = XML_TRUE; |
internalEncoding = XmlGetInternalEncodingNS(); |
namespaceSeparator = *nameSep; |
} |
else { |
internalEncoding = XmlGetInternalEncoding(); |
} |
return parser; |
} |
static void |
parserInit(XML_Parser parser, const XML_Char *encodingName) |
{ |
processor = prologInitProcessor; |
XmlPrologStateInit(&prologState); |
protocolEncodingName = (encodingName != NULL |
? poolCopyString(&tempPool, encodingName) |
: NULL); |
curBase = NULL; |
XmlInitEncoding(&initEncoding, &encoding, 0); |
userData = NULL; |
handlerArg = NULL; |
startElementHandler = NULL; |
endElementHandler = NULL; |
characterDataHandler = NULL; |
processingInstructionHandler = NULL; |
commentHandler = NULL; |
startCdataSectionHandler = NULL; |
endCdataSectionHandler = NULL; |
defaultHandler = NULL; |
startDoctypeDeclHandler = NULL; |
endDoctypeDeclHandler = NULL; |
unparsedEntityDeclHandler = NULL; |
notationDeclHandler = NULL; |
startNamespaceDeclHandler = NULL; |
endNamespaceDeclHandler = NULL; |
notStandaloneHandler = NULL; |
externalEntityRefHandler = NULL; |
externalEntityRefHandlerArg = parser; |
skippedEntityHandler = NULL; |
elementDeclHandler = NULL; |
attlistDeclHandler = NULL; |
entityDeclHandler = NULL; |
xmlDeclHandler = NULL; |
bufferPtr = buffer; |
bufferEnd = buffer; |
parseEndByteIndex = 0; |
parseEndPtr = NULL; |
declElementType = NULL; |
declAttributeId = NULL; |
declEntity = NULL; |
doctypeName = NULL; |
doctypeSysid = NULL; |
doctypePubid = NULL; |
declAttributeType = NULL; |
declNotationName = NULL; |
declNotationPublicId = NULL; |
declAttributeIsCdata = XML_FALSE; |
declAttributeIsId = XML_FALSE; |
memset(&position, 0, sizeof(POSITION)); |
errorCode = XML_ERROR_NONE; |
eventPtr = NULL; |
eventEndPtr = NULL; |
positionPtr = NULL; |
openInternalEntities = NULL; |
defaultExpandInternalEntities = XML_TRUE; |
tagLevel = 0; |
tagStack = NULL; |
inheritedBindings = NULL; |
nSpecifiedAtts = 0; |
unknownEncodingMem = NULL; |
unknownEncodingRelease = NULL; |
unknownEncodingData = NULL; |
parentParser = NULL; |
ps_parsing = XML_INITIALIZED; |
#ifdef XML_DTD |
isParamEntity = XML_FALSE; |
useForeignDTD = XML_FALSE; |
paramEntityParsing = XML_PARAM_ENTITY_PARSING_NEVER; |
#endif |
hash_secret_salt = 0; |
} |
/* moves list of bindings to freeBindingList */ |
static void FASTCALL |
moveToFreeBindingList(XML_Parser parser, BINDING *bindings) |
{ |
while (bindings) { |
BINDING *b = bindings; |
bindings = bindings->nextTagBinding; |
b->nextTagBinding = freeBindingList; |
freeBindingList = b; |
} |
} |
XML_Bool XMLCALL |
XML_ParserReset(XML_Parser parser, const XML_Char *encodingName) |
{ |
TAG *tStk; |
OPEN_INTERNAL_ENTITY *openEntityList; |
if (parentParser) |
return XML_FALSE; |
/* move tagStack to freeTagList */ |
tStk = tagStack; |
while (tStk) { |
TAG *tag = tStk; |
tStk = tStk->parent; |
tag->parent = freeTagList; |
moveToFreeBindingList(parser, tag->bindings); |
tag->bindings = NULL; |
freeTagList = tag; |
} |
/* move openInternalEntities to freeInternalEntities */ |
openEntityList = openInternalEntities; |
while (openEntityList) { |
OPEN_INTERNAL_ENTITY *openEntity = openEntityList; |
openEntityList = openEntity->next; |
openEntity->next = freeInternalEntities; |
freeInternalEntities = openEntity; |
} |
moveToFreeBindingList(parser, inheritedBindings); |
FREE(unknownEncodingMem); |
if (unknownEncodingRelease) |
unknownEncodingRelease(unknownEncodingData); |
poolClear(&tempPool); |
poolClear(&temp2Pool); |
parserInit(parser, encodingName); |
dtdReset(_dtd, &parser->m_mem); |
return XML_TRUE; |
} |
enum XML_Status XMLCALL |
XML_SetEncoding(XML_Parser parser, const XML_Char *encodingName) |
{ |
/* Block after XML_Parse()/XML_ParseBuffer() has been called. |
XXX There's no way for the caller to determine which of the |
XXX possible error cases caused the XML_STATUS_ERROR return. |
*/ |
if (ps_parsing == XML_PARSING || ps_parsing == XML_SUSPENDED) |
return XML_STATUS_ERROR; |
if (encodingName == NULL) |
protocolEncodingName = NULL; |
else { |
protocolEncodingName = poolCopyString(&tempPool, encodingName); |
if (!protocolEncodingName) |
return XML_STATUS_ERROR; |
} |
return XML_STATUS_OK; |
} |
XML_Parser XMLCALL |
XML_ExternalEntityParserCreate(XML_Parser oldParser, |
const XML_Char *context, |
const XML_Char *encodingName) |
{ |
XML_Parser parser = oldParser; |
DTD *newDtd = NULL; |
DTD *oldDtd = _dtd; |
XML_StartElementHandler oldStartElementHandler = startElementHandler; |
XML_EndElementHandler oldEndElementHandler = endElementHandler; |
XML_CharacterDataHandler oldCharacterDataHandler = characterDataHandler; |
XML_ProcessingInstructionHandler oldProcessingInstructionHandler |
= processingInstructionHandler; |
XML_CommentHandler oldCommentHandler = commentHandler; |
XML_StartCdataSectionHandler oldStartCdataSectionHandler |
= startCdataSectionHandler; |
XML_EndCdataSectionHandler oldEndCdataSectionHandler |
= endCdataSectionHandler; |
XML_DefaultHandler oldDefaultHandler = defaultHandler; |
XML_UnparsedEntityDeclHandler oldUnparsedEntityDeclHandler |
= unparsedEntityDeclHandler; |
XML_NotationDeclHandler oldNotationDeclHandler = notationDeclHandler; |
XML_StartNamespaceDeclHandler oldStartNamespaceDeclHandler |
= startNamespaceDeclHandler; |
XML_EndNamespaceDeclHandler oldEndNamespaceDeclHandler |
= endNamespaceDeclHandler; |
XML_NotStandaloneHandler oldNotStandaloneHandler = notStandaloneHandler; |
XML_ExternalEntityRefHandler oldExternalEntityRefHandler |
= externalEntityRefHandler; |
XML_SkippedEntityHandler oldSkippedEntityHandler = skippedEntityHandler; |
XML_UnknownEncodingHandler oldUnknownEncodingHandler |
= unknownEncodingHandler; |
XML_ElementDeclHandler oldElementDeclHandler = elementDeclHandler; |
XML_AttlistDeclHandler oldAttlistDeclHandler = attlistDeclHandler; |
XML_EntityDeclHandler oldEntityDeclHandler = entityDeclHandler; |
XML_XmlDeclHandler oldXmlDeclHandler = xmlDeclHandler; |
ELEMENT_TYPE * oldDeclElementType = declElementType; |
void *oldUserData = userData; |
void *oldHandlerArg = handlerArg; |
XML_Bool oldDefaultExpandInternalEntities = defaultExpandInternalEntities; |
XML_Parser oldExternalEntityRefHandlerArg = externalEntityRefHandlerArg; |
#ifdef XML_DTD |
enum XML_ParamEntityParsing oldParamEntityParsing = paramEntityParsing; |
int oldInEntityValue = prologState.inEntityValue; |
#endif |
XML_Bool oldns_triplets = ns_triplets; |
/* Note that the new parser shares the same hash secret as the old |
parser, so that dtdCopy and copyEntityTable can lookup values |
from hash tables associated with either parser without us having |
to worry which hash secrets each table has. |
*/ |
unsigned long oldhash_secret_salt = hash_secret_salt; |
#ifdef XML_DTD |
if (!context) |
newDtd = oldDtd; |
#endif /* XML_DTD */ |
/* Note that the magical uses of the pre-processor to make field |
access look more like C++ require that `parser' be overwritten |
here. This makes this function more painful to follow than it |
would be otherwise. |
*/ |
if (ns) { |
XML_Char tmp[2]; |
*tmp = namespaceSeparator; |
parser = parserCreate(encodingName, &parser->m_mem, tmp, newDtd); |
} |
else { |
parser = parserCreate(encodingName, &parser->m_mem, NULL, newDtd); |
} |
if (!parser) |
return NULL; |
startElementHandler = oldStartElementHandler; |
endElementHandler = oldEndElementHandler; |
characterDataHandler = oldCharacterDataHandler; |
processingInstructionHandler = oldProcessingInstructionHandler; |
commentHandler = oldCommentHandler; |
startCdataSectionHandler = oldStartCdataSectionHandler; |
endCdataSectionHandler = oldEndCdataSectionHandler; |
defaultHandler = oldDefaultHandler; |
unparsedEntityDeclHandler = oldUnparsedEntityDeclHandler; |
notationDeclHandler = oldNotationDeclHandler; |
startNamespaceDeclHandler = oldStartNamespaceDeclHandler; |
endNamespaceDeclHandler = oldEndNamespaceDeclHandler; |
notStandaloneHandler = oldNotStandaloneHandler; |
externalEntityRefHandler = oldExternalEntityRefHandler; |
skippedEntityHandler = oldSkippedEntityHandler; |
unknownEncodingHandler = oldUnknownEncodingHandler; |
elementDeclHandler = oldElementDeclHandler; |
attlistDeclHandler = oldAttlistDeclHandler; |
entityDeclHandler = oldEntityDeclHandler; |
xmlDeclHandler = oldXmlDeclHandler; |
declElementType = oldDeclElementType; |
userData = oldUserData; |
if (oldUserData == oldHandlerArg) |
handlerArg = userData; |
else |
handlerArg = parser; |
if (oldExternalEntityRefHandlerArg != oldParser) |
externalEntityRefHandlerArg = oldExternalEntityRefHandlerArg; |
defaultExpandInternalEntities = oldDefaultExpandInternalEntities; |
ns_triplets = oldns_triplets; |
hash_secret_salt = oldhash_secret_salt; |
parentParser = oldParser; |
#ifdef XML_DTD |
paramEntityParsing = oldParamEntityParsing; |
prologState.inEntityValue = oldInEntityValue; |
if (context) { |
#endif /* XML_DTD */ |
if (!dtdCopy(oldParser, _dtd, oldDtd, &parser->m_mem) |
|| !setContext(parser, context)) { |
XML_ParserFree(parser); |
return NULL; |
} |
processor = externalEntityInitProcessor; |
#ifdef XML_DTD |
} |
else { |
/* The DTD instance referenced by _dtd is shared between the document's |
root parser and external PE parsers, therefore one does not need to |
call setContext. In addition, one also *must* not call setContext, |
because this would overwrite existing prefix->binding pointers in |
_dtd with ones that get destroyed with the external PE parser. |
This would leave those prefixes with dangling pointers. |
*/ |
isParamEntity = XML_TRUE; |
XmlPrologStateInitExternalEntity(&prologState); |
processor = externalParEntInitProcessor; |
} |
#endif /* XML_DTD */ |
return parser; |
} |
static void FASTCALL |
destroyBindings(BINDING *bindings, XML_Parser parser) |
{ |
for (;;) { |
BINDING *b = bindings; |
if (!b) |
break; |
bindings = b->nextTagBinding; |
FREE(b->uri); |
FREE(b); |
} |
} |
void XMLCALL |
XML_ParserFree(XML_Parser parser) |
{ |
TAG *tagList; |
OPEN_INTERNAL_ENTITY *entityList; |
if (parser == NULL) |
return; |
/* free tagStack and freeTagList */ |
tagList = tagStack; |
for (;;) { |
TAG *p; |
if (tagList == NULL) { |
if (freeTagList == NULL) |
break; |
tagList = freeTagList; |
freeTagList = NULL; |
} |
p = tagList; |
tagList = tagList->parent; |
FREE(p->buf); |
destroyBindings(p->bindings, parser); |
FREE(p); |
} |
/* free openInternalEntities and freeInternalEntities */ |
entityList = openInternalEntities; |
for (;;) { |
OPEN_INTERNAL_ENTITY *openEntity; |
if (entityList == NULL) { |
if (freeInternalEntities == NULL) |
break; |
entityList = freeInternalEntities; |
freeInternalEntities = NULL; |
} |
openEntity = entityList; |
entityList = entityList->next; |
FREE(openEntity); |
} |
destroyBindings(freeBindingList, parser); |
destroyBindings(inheritedBindings, parser); |
poolDestroy(&tempPool); |
poolDestroy(&temp2Pool); |
#ifdef XML_DTD |
/* external parameter entity parsers share the DTD structure |
parser->m_dtd with the root parser, so we must not destroy it |
*/ |
if (!isParamEntity && _dtd) |
#else |
if (_dtd) |
#endif /* XML_DTD */ |
dtdDestroy(_dtd, (XML_Bool)!parentParser, &parser->m_mem); |
FREE((void *)atts); |
#ifdef XML_ATTR_INFO |
FREE((void *)attInfo); |
#endif |
FREE(groupConnector); |
FREE(buffer); |
FREE(dataBuf); |
FREE(nsAtts); |
FREE(unknownEncodingMem); |
if (unknownEncodingRelease) |
unknownEncodingRelease(unknownEncodingData); |
FREE(parser); |
} |
void XMLCALL |
XML_UseParserAsHandlerArg(XML_Parser parser) |
{ |
handlerArg = parser; |
} |
enum XML_Error XMLCALL |
XML_UseForeignDTD(XML_Parser parser, XML_Bool useDTD) |
{ |
#ifdef XML_DTD |
/* block after XML_Parse()/XML_ParseBuffer() has been called */ |
if (ps_parsing == XML_PARSING || ps_parsing == XML_SUSPENDED) |
return XML_ERROR_CANT_CHANGE_FEATURE_ONCE_PARSING; |
useForeignDTD = useDTD; |
return XML_ERROR_NONE; |
#else |
return XML_ERROR_FEATURE_REQUIRES_XML_DTD; |
#endif |
} |
void XMLCALL |
XML_SetReturnNSTriplet(XML_Parser parser, int do_nst) |
{ |
/* block after XML_Parse()/XML_ParseBuffer() has been called */ |
if (ps_parsing == XML_PARSING || ps_parsing == XML_SUSPENDED) |
return; |
ns_triplets = do_nst ? XML_TRUE : XML_FALSE; |
} |
void XMLCALL |
XML_SetUserData(XML_Parser parser, void *p) |
{ |
if (handlerArg == userData) |
handlerArg = userData = p; |
else |
userData = p; |
} |
enum XML_Status XMLCALL |
XML_SetBase(XML_Parser parser, const XML_Char *p) |
{ |
if (p) { |
p = poolCopyString(&_dtd->pool, p); |
if (!p) |
return XML_STATUS_ERROR; |
curBase = p; |
} |
else |
curBase = NULL; |
return XML_STATUS_OK; |
} |
const XML_Char * XMLCALL |
XML_GetBase(XML_Parser parser) |
{ |
return curBase; |
} |
int XMLCALL |
XML_GetSpecifiedAttributeCount(XML_Parser parser) |
{ |
return nSpecifiedAtts; |
} |
int XMLCALL |
XML_GetIdAttributeIndex(XML_Parser parser) |
{ |
return idAttIndex; |
} |
#ifdef XML_ATTR_INFO |
const XML_AttrInfo * XMLCALL |
XML_GetAttributeInfo(XML_Parser parser) |
{ |
return attInfo; |
} |
#endif |
void XMLCALL |
XML_SetElementHandler(XML_Parser parser, |
XML_StartElementHandler start, |
XML_EndElementHandler end) |
{ |
startElementHandler = start; |
endElementHandler = end; |
} |
void XMLCALL |
XML_SetStartElementHandler(XML_Parser parser, |
XML_StartElementHandler start) { |
startElementHandler = start; |
} |
void XMLCALL |
XML_SetEndElementHandler(XML_Parser parser, |
XML_EndElementHandler end) { |
endElementHandler = end; |
} |
void XMLCALL |
XML_SetCharacterDataHandler(XML_Parser parser, |
XML_CharacterDataHandler handler) |
{ |
characterDataHandler = handler; |
} |
void XMLCALL |
XML_SetProcessingInstructionHandler(XML_Parser parser, |
XML_ProcessingInstructionHandler handler) |
{ |
processingInstructionHandler = handler; |
} |
void XMLCALL |
XML_SetCommentHandler(XML_Parser parser, |
XML_CommentHandler handler) |
{ |
commentHandler = handler; |
} |
void XMLCALL |
XML_SetCdataSectionHandler(XML_Parser parser, |
XML_StartCdataSectionHandler start, |
XML_EndCdataSectionHandler end) |
{ |
startCdataSectionHandler = start; |
endCdataSectionHandler = end; |
} |
void XMLCALL |
XML_SetStartCdataSectionHandler(XML_Parser parser, |
XML_StartCdataSectionHandler start) { |
startCdataSectionHandler = start; |
} |
void XMLCALL |
XML_SetEndCdataSectionHandler(XML_Parser parser, |
XML_EndCdataSectionHandler end) { |
endCdataSectionHandler = end; |
} |
void XMLCALL |
XML_SetDefaultHandler(XML_Parser parser, |
XML_DefaultHandler handler) |
{ |
defaultHandler = handler; |
defaultExpandInternalEntities = XML_FALSE; |
} |
void XMLCALL |
XML_SetDefaultHandlerExpand(XML_Parser parser, |
XML_DefaultHandler handler) |
{ |
defaultHandler = handler; |
defaultExpandInternalEntities = XML_TRUE; |
} |
void XMLCALL |
XML_SetDoctypeDeclHandler(XML_Parser parser, |
XML_StartDoctypeDeclHandler start, |
XML_EndDoctypeDeclHandler end) |
{ |
startDoctypeDeclHandler = start; |
endDoctypeDeclHandler = end; |
} |
void XMLCALL |
XML_SetStartDoctypeDeclHandler(XML_Parser parser, |
XML_StartDoctypeDeclHandler start) { |
startDoctypeDeclHandler = start; |
} |
void XMLCALL |
XML_SetEndDoctypeDeclHandler(XML_Parser parser, |
XML_EndDoctypeDeclHandler end) { |
endDoctypeDeclHandler = end; |
} |
void XMLCALL |
XML_SetUnparsedEntityDeclHandler(XML_Parser parser, |
XML_UnparsedEntityDeclHandler handler) |
{ |
unparsedEntityDeclHandler = handler; |
} |
void XMLCALL |
XML_SetNotationDeclHandler(XML_Parser parser, |
XML_NotationDeclHandler handler) |
{ |
notationDeclHandler = handler; |
} |
void XMLCALL |
XML_SetNamespaceDeclHandler(XML_Parser parser, |
XML_StartNamespaceDeclHandler start, |
XML_EndNamespaceDeclHandler end) |
{ |
startNamespaceDeclHandler = start; |
endNamespaceDeclHandler = end; |
} |
void XMLCALL |
XML_SetStartNamespaceDeclHandler(XML_Parser parser, |
XML_StartNamespaceDeclHandler start) { |
startNamespaceDeclHandler = start; |
} |
void XMLCALL |
XML_SetEndNamespaceDeclHandler(XML_Parser parser, |
XML_EndNamespaceDeclHandler end) { |
endNamespaceDeclHandler = end; |
} |
void XMLCALL |
XML_SetNotStandaloneHandler(XML_Parser parser, |
XML_NotStandaloneHandler handler) |
{ |
notStandaloneHandler = handler; |
} |
void XMLCALL |
XML_SetExternalEntityRefHandler(XML_Parser parser, |
XML_ExternalEntityRefHandler handler) |
{ |
externalEntityRefHandler = handler; |
} |
void XMLCALL |
XML_SetExternalEntityRefHandlerArg(XML_Parser parser, void *arg) |
{ |
if (arg) |
externalEntityRefHandlerArg = (XML_Parser)arg; |
else |
externalEntityRefHandlerArg = parser; |
} |
void XMLCALL |
XML_SetSkippedEntityHandler(XML_Parser parser, |
XML_SkippedEntityHandler handler) |
{ |
skippedEntityHandler = handler; |
} |
void XMLCALL |
XML_SetUnknownEncodingHandler(XML_Parser parser, |
XML_UnknownEncodingHandler handler, |
void *data) |
{ |
unknownEncodingHandler = handler; |
unknownEncodingHandlerData = data; |
} |
void XMLCALL |
XML_SetElementDeclHandler(XML_Parser parser, |
XML_ElementDeclHandler eldecl) |
{ |
elementDeclHandler = eldecl; |
} |
void XMLCALL |
XML_SetAttlistDeclHandler(XML_Parser parser, |
XML_AttlistDeclHandler attdecl) |
{ |
attlistDeclHandler = attdecl; |
} |
void XMLCALL |
XML_SetEntityDeclHandler(XML_Parser parser, |
XML_EntityDeclHandler handler) |
{ |
entityDeclHandler = handler; |
} |
void XMLCALL |
XML_SetXmlDeclHandler(XML_Parser parser, |
XML_XmlDeclHandler handler) { |
xmlDeclHandler = handler; |
} |
int XMLCALL |
XML_SetParamEntityParsing(XML_Parser parser, |
enum XML_ParamEntityParsing peParsing) |
{ |
/* block after XML_Parse()/XML_ParseBuffer() has been called */ |
if (ps_parsing == XML_PARSING || ps_parsing == XML_SUSPENDED) |
return 0; |
#ifdef XML_DTD |
paramEntityParsing = peParsing; |
return 1; |
#else |
return peParsing == XML_PARAM_ENTITY_PARSING_NEVER; |
#endif |
} |
int XMLCALL |
XML_SetHashSalt(XML_Parser parser, |
unsigned long hash_salt) |
{ |
/* block after XML_Parse()/XML_ParseBuffer() has been called */ |
if (ps_parsing == XML_PARSING || ps_parsing == XML_SUSPENDED) |
return 0; |
hash_secret_salt = hash_salt; |
return 1; |
} |
enum XML_Status XMLCALL |
XML_Parse(XML_Parser parser, const char *s, int len, int isFinal) |
{ |
switch (ps_parsing) { |
case XML_SUSPENDED: |
errorCode = XML_ERROR_SUSPENDED; |
return XML_STATUS_ERROR; |
case XML_FINISHED: |
errorCode = XML_ERROR_FINISHED; |
return XML_STATUS_ERROR; |
case XML_INITIALIZED: |
if (parentParser == NULL && !startParsing(parser)) { |
errorCode = XML_ERROR_NO_MEMORY; |
return XML_STATUS_ERROR; |
} |
default: |
ps_parsing = XML_PARSING; |
} |
if (len == 0) { |
ps_finalBuffer = (XML_Bool)isFinal; |
if (!isFinal) |
return XML_STATUS_OK; |
positionPtr = bufferPtr; |
parseEndPtr = bufferEnd; |
/* If data are left over from last buffer, and we now know that these |
data are the final chunk of input, then we have to check them again |
to detect errors based on that fact. |
*/ |
errorCode = processor(parser, bufferPtr, parseEndPtr, &bufferPtr); |
if (errorCode == XML_ERROR_NONE) { |
switch (ps_parsing) { |
case XML_SUSPENDED: |
XmlUpdatePosition(encoding, positionPtr, bufferPtr, &position); |
positionPtr = bufferPtr; |
return XML_STATUS_SUSPENDED; |
case XML_INITIALIZED: |
case XML_PARSING: |
ps_parsing = XML_FINISHED; |
/* fall through */ |
default: |
return XML_STATUS_OK; |
} |
} |
eventEndPtr = eventPtr; |
processor = errorProcessor; |
return XML_STATUS_ERROR; |
} |
#ifndef XML_CONTEXT_BYTES |
else if (bufferPtr == bufferEnd) { |
const char *end; |
int nLeftOver; |
enum XML_Error result; |
parseEndByteIndex += len; |
positionPtr = s; |
ps_finalBuffer = (XML_Bool)isFinal; |
errorCode = processor(parser, s, parseEndPtr = s + len, &end); |
if (errorCode != XML_ERROR_NONE) { |
eventEndPtr = eventPtr; |
processor = errorProcessor; |
return XML_STATUS_ERROR; |
} |
else { |
switch (ps_parsing) { |
case XML_SUSPENDED: |
result = XML_STATUS_SUSPENDED; |
break; |
case XML_INITIALIZED: |
case XML_PARSING: |
if (isFinal) { |
ps_parsing = XML_FINISHED; |
return XML_STATUS_OK; |
} |
/* fall through */ |
default: |
result = XML_STATUS_OK; |
} |
} |
XmlUpdatePosition(encoding, positionPtr, end, &position); |
nLeftOver = s + len - end; |
if (nLeftOver) { |
if (buffer == NULL || nLeftOver > bufferLim - buffer) { |
/* FIXME avoid integer overflow */ |
char *temp; |
temp = (buffer == NULL |
? (char *)MALLOC(len * 2) |
: (char *)REALLOC(buffer, len * 2)); |
if (temp == NULL) { |
errorCode = XML_ERROR_NO_MEMORY; |
eventPtr = eventEndPtr = NULL; |
processor = errorProcessor; |
return XML_STATUS_ERROR; |
} |
buffer = temp; |
bufferLim = buffer + len * 2; |
} |
memcpy(buffer, end, nLeftOver); |
} |
bufferPtr = buffer; |
bufferEnd = buffer + nLeftOver; |
positionPtr = bufferPtr; |
parseEndPtr = bufferEnd; |
eventPtr = bufferPtr; |
eventEndPtr = bufferPtr; |
return result; |
} |
#endif /* not defined XML_CONTEXT_BYTES */ |
else { |
void *buff = XML_GetBuffer(parser, len); |
if (buff == NULL) |
return XML_STATUS_ERROR; |
else { |
memcpy(buff, s, len); |
return XML_ParseBuffer(parser, len, isFinal); |
} |
} |
} |
enum XML_Status XMLCALL |
XML_ParseBuffer(XML_Parser parser, int len, int isFinal) |
{ |
const char *start; |
enum XML_Status result = XML_STATUS_OK; |
switch (ps_parsing) { |
case XML_SUSPENDED: |
errorCode = XML_ERROR_SUSPENDED; |
return XML_STATUS_ERROR; |
case XML_FINISHED: |
errorCode = XML_ERROR_FINISHED; |
return XML_STATUS_ERROR; |
case XML_INITIALIZED: |
if (parentParser == NULL && !startParsing(parser)) { |
errorCode = XML_ERROR_NO_MEMORY; |
return XML_STATUS_ERROR; |
} |
default: |
ps_parsing = XML_PARSING; |
} |
start = bufferPtr; |
positionPtr = start; |
bufferEnd += len; |
parseEndPtr = bufferEnd; |
parseEndByteIndex += len; |
ps_finalBuffer = (XML_Bool)isFinal; |
errorCode = processor(parser, start, parseEndPtr, &bufferPtr); |
if (errorCode != XML_ERROR_NONE) { |
eventEndPtr = eventPtr; |
processor = errorProcessor; |
return XML_STATUS_ERROR; |
} |
else { |
switch (ps_parsing) { |
case XML_SUSPENDED: |
result = XML_STATUS_SUSPENDED; |
break; |
case XML_INITIALIZED: |
case XML_PARSING: |
if (isFinal) { |
ps_parsing = XML_FINISHED; |
return result; |
} |
default: ; /* should not happen */ |
} |
} |
XmlUpdatePosition(encoding, positionPtr, bufferPtr, &position); |
positionPtr = bufferPtr; |
return result; |
} |
void * XMLCALL |
XML_GetBuffer(XML_Parser parser, int len) |
{ |
switch (ps_parsing) { |
case XML_SUSPENDED: |
errorCode = XML_ERROR_SUSPENDED; |
return NULL; |
case XML_FINISHED: |
errorCode = XML_ERROR_FINISHED; |
return NULL; |
default: ; |
} |
if (len > bufferLim - bufferEnd) { |
/* FIXME avoid integer overflow */ |
int neededSize = len + (int)(bufferEnd - bufferPtr); |
#ifdef XML_CONTEXT_BYTES |
int keep = (int)(bufferPtr - buffer); |
if (keep > XML_CONTEXT_BYTES) |
keep = XML_CONTEXT_BYTES; |
neededSize += keep; |
#endif /* defined XML_CONTEXT_BYTES */ |
if (neededSize <= bufferLim - buffer) { |
#ifdef XML_CONTEXT_BYTES |
if (keep < bufferPtr - buffer) { |
int offset = (int)(bufferPtr - buffer) - keep; |
memmove(buffer, &buffer[offset], bufferEnd - bufferPtr + keep); |
bufferEnd -= offset; |
bufferPtr -= offset; |
} |
#else |
memmove(buffer, bufferPtr, bufferEnd - bufferPtr); |
bufferEnd = buffer + (bufferEnd - bufferPtr); |
bufferPtr = buffer; |
#endif /* not defined XML_CONTEXT_BYTES */ |
} |
else { |
char *newBuf; |
int bufferSize = (int)(bufferLim - bufferPtr); |
if (bufferSize == 0) |
bufferSize = INIT_BUFFER_SIZE; |
do { |
bufferSize *= 2; |
} while (bufferSize < neededSize); |
newBuf = (char *)MALLOC(bufferSize); |
if (newBuf == 0) { |
errorCode = XML_ERROR_NO_MEMORY; |
return NULL; |
} |
bufferLim = newBuf + bufferSize; |
#ifdef XML_CONTEXT_BYTES |
if (bufferPtr) { |
int keep = (int)(bufferPtr - buffer); |
if (keep > XML_CONTEXT_BYTES) |
keep = XML_CONTEXT_BYTES; |
memcpy(newBuf, &bufferPtr[-keep], bufferEnd - bufferPtr + keep); |
FREE(buffer); |
buffer = newBuf; |
bufferEnd = buffer + (bufferEnd - bufferPtr) + keep; |
bufferPtr = buffer + keep; |
} |
else { |
bufferEnd = newBuf + (bufferEnd - bufferPtr); |
bufferPtr = buffer = newBuf; |
} |
#else |
if (bufferPtr) { |
memcpy(newBuf, bufferPtr, bufferEnd - bufferPtr); |
FREE(buffer); |
} |
bufferEnd = newBuf + (bufferEnd - bufferPtr); |
bufferPtr = buffer = newBuf; |
#endif /* not defined XML_CONTEXT_BYTES */ |
} |
eventPtr = eventEndPtr = NULL; |
positionPtr = NULL; |
} |
return bufferEnd; |
} |
enum XML_Status XMLCALL |
XML_StopParser(XML_Parser parser, XML_Bool resumable) |
{ |
switch (ps_parsing) { |
case XML_SUSPENDED: |
if (resumable) { |
errorCode = XML_ERROR_SUSPENDED; |
return XML_STATUS_ERROR; |
} |
ps_parsing = XML_FINISHED; |
break; |
case XML_FINISHED: |
errorCode = XML_ERROR_FINISHED; |
return XML_STATUS_ERROR; |
default: |
if (resumable) { |
#ifdef XML_DTD |
if (isParamEntity) { |
errorCode = XML_ERROR_SUSPEND_PE; |
return XML_STATUS_ERROR; |
} |
#endif |
ps_parsing = XML_SUSPENDED; |
} |
else |
ps_parsing = XML_FINISHED; |
} |
return XML_STATUS_OK; |
} |
enum XML_Status XMLCALL |
XML_ResumeParser(XML_Parser parser) |
{ |
enum XML_Status result = XML_STATUS_OK; |
if (ps_parsing != XML_SUSPENDED) { |
errorCode = XML_ERROR_NOT_SUSPENDED; |
return XML_STATUS_ERROR; |
} |
ps_parsing = XML_PARSING; |
errorCode = processor(parser, bufferPtr, parseEndPtr, &bufferPtr); |
if (errorCode != XML_ERROR_NONE) { |
eventEndPtr = eventPtr; |
processor = errorProcessor; |
return XML_STATUS_ERROR; |
} |
else { |
switch (ps_parsing) { |
case XML_SUSPENDED: |
result = XML_STATUS_SUSPENDED; |
break; |
case XML_INITIALIZED: |
case XML_PARSING: |
if (ps_finalBuffer) { |
ps_parsing = XML_FINISHED; |
return result; |
} |
default: ; |
} |
} |
XmlUpdatePosition(encoding, positionPtr, bufferPtr, &position); |
positionPtr = bufferPtr; |
return result; |
} |
void XMLCALL |
XML_GetParsingStatus(XML_Parser parser, XML_ParsingStatus *status) |
{ |
assert(status != NULL); |
*status = parser->m_parsingStatus; |
} |
enum XML_Error XMLCALL |
XML_GetErrorCode(XML_Parser parser) |
{ |
return errorCode; |
} |
XML_Index XMLCALL |
XML_GetCurrentByteIndex(XML_Parser parser) |
{ |
if (eventPtr) |
return parseEndByteIndex - (parseEndPtr - eventPtr); |
return -1; |
} |
int XMLCALL |
XML_GetCurrentByteCount(XML_Parser parser) |
{ |
if (eventEndPtr && eventPtr) |
return (int)(eventEndPtr - eventPtr); |
return 0; |
} |
const char * XMLCALL |
XML_GetInputContext(XML_Parser parser, int *offset, int *size) |
{ |
#ifdef XML_CONTEXT_BYTES |
if (eventPtr && buffer) { |
*offset = (int)(eventPtr - buffer); |
*size = (int)(bufferEnd - buffer); |
return buffer; |
} |
#endif /* defined XML_CONTEXT_BYTES */ |
return (char *) 0; |
} |
XML_Size XMLCALL |
XML_GetCurrentLineNumber(XML_Parser parser) |
{ |
if (eventPtr && eventPtr >= positionPtr) { |
XmlUpdatePosition(encoding, positionPtr, eventPtr, &position); |
positionPtr = eventPtr; |
} |
return position.lineNumber + 1; |
} |
XML_Size XMLCALL |
XML_GetCurrentColumnNumber(XML_Parser parser) |
{ |
if (eventPtr && eventPtr >= positionPtr) { |
XmlUpdatePosition(encoding, positionPtr, eventPtr, &position); |
positionPtr = eventPtr; |
} |
return position.columnNumber; |
} |
void XMLCALL |
XML_FreeContentModel(XML_Parser parser, XML_Content *model) |
{ |
FREE(model); |
} |
void * XMLCALL |
XML_MemMalloc(XML_Parser parser, size_t size) |
{ |
return MALLOC(size); |
} |
void * XMLCALL |
XML_MemRealloc(XML_Parser parser, void *ptr, size_t size) |
{ |
return REALLOC(ptr, size); |
} |
void XMLCALL |
XML_MemFree(XML_Parser parser, void *ptr) |
{ |
FREE(ptr); |
} |
void XMLCALL |
XML_DefaultCurrent(XML_Parser parser) |
{ |
if (defaultHandler) { |
if (openInternalEntities) |
reportDefault(parser, |
internalEncoding, |
openInternalEntities->internalEventPtr, |
openInternalEntities->internalEventEndPtr); |
else |
reportDefault(parser, encoding, eventPtr, eventEndPtr); |
} |
} |
const XML_LChar * XMLCALL |
XML_ErrorString(enum XML_Error code) |
{ |
static const XML_LChar* const message[] = { |
0, |
XML_L("out of memory"), |
XML_L("syntax error"), |
XML_L("no element found"), |
XML_L("not well-formed (invalid token)"), |
XML_L("unclosed token"), |
XML_L("partial character"), |
XML_L("mismatched tag"), |
XML_L("duplicate attribute"), |
XML_L("junk after document element"), |
XML_L("illegal parameter entity reference"), |
XML_L("undefined entity"), |
XML_L("recursive entity reference"), |
XML_L("asynchronous entity"), |
XML_L("reference to invalid character number"), |
XML_L("reference to binary entity"), |
XML_L("reference to external entity in attribute"), |
XML_L("XML or text declaration not at start of entity"), |
XML_L("unknown encoding"), |
XML_L("encoding specified in XML declaration is incorrect"), |
XML_L("unclosed CDATA section"), |
XML_L("error in processing external entity reference"), |
XML_L("document is not standalone"), |
XML_L("unexpected parser state - please send a bug report"), |
XML_L("entity declared in parameter entity"), |
XML_L("requested feature requires XML_DTD support in Expat"), |
XML_L("cannot change setting once parsing has begun"), |
XML_L("unbound prefix"), |
XML_L("must not undeclare prefix"), |
XML_L("incomplete markup in parameter entity"), |
XML_L("XML declaration not well-formed"), |
XML_L("text declaration not well-formed"), |
XML_L("illegal character(s) in public id"), |
XML_L("parser suspended"), |
XML_L("parser not suspended"), |
XML_L("parsing aborted"), |
XML_L("parsing finished"), |
XML_L("cannot suspend in external parameter entity"), |
XML_L("reserved prefix (xml) must not be undeclared or bound to another namespace name"), |
XML_L("reserved prefix (xmlns) must not be declared or undeclared"), |
XML_L("prefix must not be bound to one of the reserved namespace names") |
}; |
if (code > 0 && code < sizeof(message)/sizeof(message[0])) |
return message[code]; |
return NULL; |
} |
const XML_LChar * XMLCALL |
XML_ExpatVersion(void) { |
/* V1 is used to string-ize the version number. However, it would |
string-ize the actual version macro *names* unless we get them |
substituted before being passed to V1. CPP is defined to expand |
a macro, then rescan for more expansions. Thus, we use V2 to expand |
the version macros, then CPP will expand the resulting V1() macro |
with the correct numerals. */ |
/* ### I'm assuming cpp is portable in this respect... */ |
#define V1(a,b,c) XML_L(#a)XML_L(".")XML_L(#b)XML_L(".")XML_L(#c) |
#define V2(a,b,c) XML_L("expat_")V1(a,b,c) |
return V2(XML_MAJOR_VERSION, XML_MINOR_VERSION, XML_MICRO_VERSION); |
#undef V1 |
#undef V2 |
} |
XML_Expat_Version XMLCALL |
XML_ExpatVersionInfo(void) |
{ |
XML_Expat_Version version; |
version.major = XML_MAJOR_VERSION; |
version.minor = XML_MINOR_VERSION; |
version.micro = XML_MICRO_VERSION; |
return version; |
} |
const XML_Feature * XMLCALL |
XML_GetFeatureList(void) |
{ |
static const XML_Feature features[] = { |
{XML_FEATURE_SIZEOF_XML_CHAR, XML_L("sizeof(XML_Char)"), |
sizeof(XML_Char)}, |
{XML_FEATURE_SIZEOF_XML_LCHAR, XML_L("sizeof(XML_LChar)"), |
sizeof(XML_LChar)}, |
#ifdef XML_UNICODE |
{XML_FEATURE_UNICODE, XML_L("XML_UNICODE"), 0}, |
#endif |
#ifdef XML_UNICODE_WCHAR_T |
{XML_FEATURE_UNICODE_WCHAR_T, XML_L("XML_UNICODE_WCHAR_T"), 0}, |
#endif |
#ifdef XML_DTD |
{XML_FEATURE_DTD, XML_L("XML_DTD"), 0}, |
#endif |
#ifdef XML_CONTEXT_BYTES |
{XML_FEATURE_CONTEXT_BYTES, XML_L("XML_CONTEXT_BYTES"), |
XML_CONTEXT_BYTES}, |
#endif |
#ifdef XML_MIN_SIZE |
{XML_FEATURE_MIN_SIZE, XML_L("XML_MIN_SIZE"), 0}, |
#endif |
#ifdef XML_NS |
{XML_FEATURE_NS, XML_L("XML_NS"), 0}, |
#endif |
#ifdef XML_LARGE_SIZE |
{XML_FEATURE_LARGE_SIZE, XML_L("XML_LARGE_SIZE"), 0}, |
#endif |
#ifdef XML_ATTR_INFO |
{XML_FEATURE_ATTR_INFO, XML_L("XML_ATTR_INFO"), 0}, |
#endif |
{XML_FEATURE_END, NULL, 0} |
}; |
return features; |
} |
/* Initially tag->rawName always points into the parse buffer; |
for those TAG instances opened while the current parse buffer was |
processed, and not yet closed, we need to store tag->rawName in a more |
permanent location, since the parse buffer is about to be discarded. |
*/ |
static XML_Bool |
storeRawNames(XML_Parser parser) |
{ |
TAG *tag = tagStack; |
while (tag) { |
int bufSize; |
int nameLen = sizeof(XML_Char) * (tag->name.strLen + 1); |
char *rawNameBuf = tag->buf + nameLen; |
/* Stop if already stored. Since tagStack is a stack, we can stop |
at the first entry that has already been copied; everything |
below it in the stack is already been accounted for in a |
previous call to this function. |
*/ |
if (tag->rawName == rawNameBuf) |
break; |
/* For re-use purposes we need to ensure that the |
size of tag->buf is a multiple of sizeof(XML_Char). |
*/ |
bufSize = nameLen + ROUND_UP(tag->rawNameLength, sizeof(XML_Char)); |
if (bufSize > tag->bufEnd - tag->buf) { |
char *temp = (char *)REALLOC(tag->buf, bufSize); |
if (temp == NULL) |
return XML_FALSE; |
/* if tag->name.str points to tag->buf (only when namespace |
processing is off) then we have to update it |
*/ |
if (tag->name.str == (XML_Char *)tag->buf) |
tag->name.str = (XML_Char *)temp; |
/* if tag->name.localPart is set (when namespace processing is on) |
then update it as well, since it will always point into tag->buf |
*/ |
if (tag->name.localPart) |
tag->name.localPart = (XML_Char *)temp + (tag->name.localPart - |
(XML_Char *)tag->buf); |
tag->buf = temp; |
tag->bufEnd = temp + bufSize; |
rawNameBuf = temp + nameLen; |
} |
memcpy(rawNameBuf, tag->rawName, tag->rawNameLength); |
tag->rawName = rawNameBuf; |
tag = tag->parent; |
} |
return XML_TRUE; |
} |
static enum XML_Error PTRCALL |
contentProcessor(XML_Parser parser, |
const char *start, |
const char *end, |
const char **endPtr) |
{ |
enum XML_Error result = doContent(parser, 0, encoding, start, end, |
endPtr, (XML_Bool)!ps_finalBuffer); |
if (result == XML_ERROR_NONE) { |
if (!storeRawNames(parser)) |
return XML_ERROR_NO_MEMORY; |
} |
return result; |
} |
static enum XML_Error PTRCALL |
externalEntityInitProcessor(XML_Parser parser, |
const char *start, |
const char *end, |
const char **endPtr) |
{ |
enum XML_Error result = initializeEncoding(parser); |
if (result != XML_ERROR_NONE) |
return result; |
processor = externalEntityInitProcessor2; |
return externalEntityInitProcessor2(parser, start, end, endPtr); |
} |
static enum XML_Error PTRCALL |
externalEntityInitProcessor2(XML_Parser parser, |
const char *start, |
const char *end, |
const char **endPtr) |
{ |
const char *next = start; /* XmlContentTok doesn't always set the last arg */ |
int tok = XmlContentTok(encoding, start, end, &next); |
switch (tok) { |
case XML_TOK_BOM: |
/* If we are at the end of the buffer, this would cause the next stage, |
i.e. externalEntityInitProcessor3, to pass control directly to |
doContent (by detecting XML_TOK_NONE) without processing any xml text |
declaration - causing the error XML_ERROR_MISPLACED_XML_PI in doContent. |
*/ |
if (next == end && !ps_finalBuffer) { |
*endPtr = next; |
return XML_ERROR_NONE; |
} |
start = next; |
break; |
case XML_TOK_PARTIAL: |
if (!ps_finalBuffer) { |
*endPtr = start; |
return XML_ERROR_NONE; |
} |
eventPtr = start; |
return XML_ERROR_UNCLOSED_TOKEN; |
case XML_TOK_PARTIAL_CHAR: |
if (!ps_finalBuffer) { |
*endPtr = start; |
return XML_ERROR_NONE; |
} |
eventPtr = start; |
return XML_ERROR_PARTIAL_CHAR; |
} |
processor = externalEntityInitProcessor3; |
return externalEntityInitProcessor3(parser, start, end, endPtr); |
} |
static enum XML_Error PTRCALL |
externalEntityInitProcessor3(XML_Parser parser, |
const char *start, |
const char *end, |
const char **endPtr) |
{ |
int tok; |
const char *next = start; /* XmlContentTok doesn't always set the last arg */ |
eventPtr = start; |
tok = XmlContentTok(encoding, start, end, &next); |
eventEndPtr = next; |
switch (tok) { |
case XML_TOK_XML_DECL: |
{ |
enum XML_Error result; |
result = processXmlDecl(parser, 1, start, next); |
if (result != XML_ERROR_NONE) |
return result; |
switch (ps_parsing) { |
case XML_SUSPENDED: |
*endPtr = next; |
return XML_ERROR_NONE; |
case XML_FINISHED: |
return XML_ERROR_ABORTED; |
default: |
start = next; |
} |
} |
break; |
case XML_TOK_PARTIAL: |
if (!ps_finalBuffer) { |
*endPtr = start; |
return XML_ERROR_NONE; |
} |
return XML_ERROR_UNCLOSED_TOKEN; |
case XML_TOK_PARTIAL_CHAR: |
if (!ps_finalBuffer) { |
*endPtr = start; |
return XML_ERROR_NONE; |
} |
return XML_ERROR_PARTIAL_CHAR; |
} |
processor = externalEntityContentProcessor; |
tagLevel = 1; |
return externalEntityContentProcessor(parser, start, end, endPtr); |
} |
static enum XML_Error PTRCALL |
externalEntityContentProcessor(XML_Parser parser, |
const char *start, |
const char *end, |
const char **endPtr) |
{ |
enum XML_Error result = doContent(parser, 1, encoding, start, end, |
endPtr, (XML_Bool)!ps_finalBuffer); |
if (result == XML_ERROR_NONE) { |
if (!storeRawNames(parser)) |
return XML_ERROR_NO_MEMORY; |
} |
return result; |
} |
static enum XML_Error |
doContent(XML_Parser parser, |
int startTagLevel, |
const ENCODING *enc, |
const char *s, |
const char *end, |
const char **nextPtr, |
XML_Bool haveMore) |
{ |
/* save one level of indirection */ |
DTD * const dtd = _dtd; |
const char **eventPP; |
const char **eventEndPP; |
if (enc == encoding) { |
eventPP = &eventPtr; |
eventEndPP = &eventEndPtr; |
} |
else { |
eventPP = &(openInternalEntities->internalEventPtr); |
eventEndPP = &(openInternalEntities->internalEventEndPtr); |
} |
*eventPP = s; |
for (;;) { |
const char *next = s; /* XmlContentTok doesn't always set the last arg */ |
int tok = XmlContentTok(enc, s, end, &next); |
*eventEndPP = next; |
switch (tok) { |
case XML_TOK_TRAILING_CR: |
if (haveMore) { |
*nextPtr = s; |
return XML_ERROR_NONE; |
} |
*eventEndPP = end; |
if (characterDataHandler) { |
XML_Char c = 0xA; |
characterDataHandler(handlerArg, &c, 1); |
} |
else if (defaultHandler) |
reportDefault(parser, enc, s, end); |
/* We are at the end of the final buffer, should we check for |
XML_SUSPENDED, XML_FINISHED? |
*/ |
if (startTagLevel == 0) |
return XML_ERROR_NO_ELEMENTS; |
if (tagLevel != startTagLevel) |
return XML_ERROR_ASYNC_ENTITY; |
*nextPtr = end; |
return XML_ERROR_NONE; |
case XML_TOK_NONE: |
if (haveMore) { |
*nextPtr = s; |
return XML_ERROR_NONE; |
} |
if (startTagLevel > 0) { |
if (tagLevel != startTagLevel) |
return XML_ERROR_ASYNC_ENTITY; |
*nextPtr = s; |
return XML_ERROR_NONE; |
} |
return XML_ERROR_NO_ELEMENTS; |
case XML_TOK_INVALID: |
*eventPP = next; |
return XML_ERROR_INVALID_TOKEN; |
case XML_TOK_PARTIAL: |
if (haveMore) { |
*nextPtr = s; |
return XML_ERROR_NONE; |
} |
return XML_ERROR_UNCLOSED_TOKEN; |
case XML_TOK_PARTIAL_CHAR: |
if (haveMore) { |
*nextPtr = s; |
return XML_ERROR_NONE; |
} |
return XML_ERROR_PARTIAL_CHAR; |
case XML_TOK_ENTITY_REF: |
{ |
const XML_Char *name; |
ENTITY *entity; |
XML_Char ch = (XML_Char) XmlPredefinedEntityName(enc, |
s + enc->minBytesPerChar, |
next - enc->minBytesPerChar); |
if (ch) { |
if (characterDataHandler) |
characterDataHandler(handlerArg, &ch, 1); |
else if (defaultHandler) |
reportDefault(parser, enc, s, next); |
break; |
} |
name = poolStoreString(&dtd->pool, enc, |
s + enc->minBytesPerChar, |
next - enc->minBytesPerChar); |
if (!name) |
return XML_ERROR_NO_MEMORY; |
entity = (ENTITY *)lookup(parser, &dtd->generalEntities, name, 0); |
poolDiscard(&dtd->pool); |
/* First, determine if a check for an existing declaration is needed; |
if yes, check that the entity exists, and that it is internal, |
otherwise call the skipped entity or default handler. |
*/ |
if (!dtd->hasParamEntityRefs || dtd->standalone) { |
if (!entity) |
return XML_ERROR_UNDEFINED_ENTITY; |
else if (!entity->is_internal) |
return XML_ERROR_ENTITY_DECLARED_IN_PE; |
} |
else if (!entity) { |
if (skippedEntityHandler) |
skippedEntityHandler(handlerArg, name, 0); |
else if (defaultHandler) |
reportDefault(parser, enc, s, next); |
break; |
} |
if (entity->open) |
return XML_ERROR_RECURSIVE_ENTITY_REF; |
if (entity->notation) |
return XML_ERROR_BINARY_ENTITY_REF; |
if (entity->textPtr) { |
enum XML_Error result; |
if (!defaultExpandInternalEntities) { |
if (skippedEntityHandler) |
skippedEntityHandler(handlerArg, entity->name, 0); |
else if (defaultHandler) |
reportDefault(parser, enc, s, next); |
break; |
} |
result = processInternalEntity(parser, entity, XML_FALSE); |
if (result != XML_ERROR_NONE) |
return result; |
} |
else if (externalEntityRefHandler) { |
const XML_Char *context; |
entity->open = XML_TRUE; |
context = getContext(parser); |
entity->open = XML_FALSE; |
if (!context) |
return XML_ERROR_NO_MEMORY; |
if (!externalEntityRefHandler(externalEntityRefHandlerArg, |
context, |
entity->base, |
entity->systemId, |
entity->publicId)) |
return XML_ERROR_EXTERNAL_ENTITY_HANDLING; |
poolDiscard(&tempPool); |
} |
else if (defaultHandler) |
reportDefault(parser, enc, s, next); |
break; |
} |
case XML_TOK_START_TAG_NO_ATTS: |
/* fall through */ |
case XML_TOK_START_TAG_WITH_ATTS: |
{ |
TAG *tag; |
enum XML_Error result; |
XML_Char *toPtr; |
if (freeTagList) { |
tag = freeTagList; |
freeTagList = freeTagList->parent; |
} |
else { |
tag = (TAG *)MALLOC(sizeof(TAG)); |
if (!tag) |
return XML_ERROR_NO_MEMORY; |
tag->buf = (char *)MALLOC(INIT_TAG_BUF_SIZE); |
if (!tag->buf) { |
FREE(tag); |
return XML_ERROR_NO_MEMORY; |
} |
tag->bufEnd = tag->buf + INIT_TAG_BUF_SIZE; |
} |
tag->bindings = NULL; |
tag->parent = tagStack; |
tagStack = tag; |
tag->name.localPart = NULL; |
tag->name.prefix = NULL; |
tag->rawName = s + enc->minBytesPerChar; |
tag->rawNameLength = XmlNameLength(enc, tag->rawName); |
++tagLevel; |
{ |
const char *rawNameEnd = tag->rawName + tag->rawNameLength; |
const char *fromPtr = tag->rawName; |
toPtr = (XML_Char *)tag->buf; |
for (;;) { |
int bufSize; |
int convLen; |
XmlConvert(enc, |
&fromPtr, rawNameEnd, |
(ICHAR **)&toPtr, (ICHAR *)tag->bufEnd - 1); |
convLen = (int)(toPtr - (XML_Char *)tag->buf); |
if (fromPtr == rawNameEnd) { |
tag->name.strLen = convLen; |
break; |
} |
bufSize = (int)(tag->bufEnd - tag->buf) << 1; |
{ |
char *temp = (char *)REALLOC(tag->buf, bufSize); |
if (temp == NULL) |
return XML_ERROR_NO_MEMORY; |
tag->buf = temp; |
tag->bufEnd = temp + bufSize; |
toPtr = (XML_Char *)temp + convLen; |
} |
} |
} |
tag->name.str = (XML_Char *)tag->buf; |
*toPtr = XML_T('\0'); |
result = storeAtts(parser, enc, s, &(tag->name), &(tag->bindings)); |
if (result) |
return result; |
if (startElementHandler) |
startElementHandler(handlerArg, tag->name.str, |
(const XML_Char **)atts); |
else if (defaultHandler) |
reportDefault(parser, enc, s, next); |
poolClear(&tempPool); |
break; |
} |
case XML_TOK_EMPTY_ELEMENT_NO_ATTS: |
/* fall through */ |
case XML_TOK_EMPTY_ELEMENT_WITH_ATTS: |
{ |
const char *rawName = s + enc->minBytesPerChar; |
enum XML_Error result; |
BINDING *bindings = NULL; |
XML_Bool noElmHandlers = XML_TRUE; |
TAG_NAME name; |
name.str = poolStoreString(&tempPool, enc, rawName, |
rawName + XmlNameLength(enc, rawName)); |
if (!name.str) |
return XML_ERROR_NO_MEMORY; |
poolFinish(&tempPool); |
result = storeAtts(parser, enc, s, &name, &bindings); |
if (result) |
return result; |
poolFinish(&tempPool); |
if (startElementHandler) { |
startElementHandler(handlerArg, name.str, (const XML_Char **)atts); |
noElmHandlers = XML_FALSE; |
} |
if (endElementHandler) { |
if (startElementHandler) |
*eventPP = *eventEndPP; |
endElementHandler(handlerArg, name.str); |
noElmHandlers = XML_FALSE; |
} |
if (noElmHandlers && defaultHandler) |
reportDefault(parser, enc, s, next); |
poolClear(&tempPool); |
while (bindings) { |
BINDING *b = bindings; |
if (endNamespaceDeclHandler) |
endNamespaceDeclHandler(handlerArg, b->prefix->name); |
bindings = bindings->nextTagBinding; |
b->nextTagBinding = freeBindingList; |
freeBindingList = b; |
b->prefix->binding = b->prevPrefixBinding; |
} |
} |
if (tagLevel == 0) |
return epilogProcessor(parser, next, end, nextPtr); |
break; |
case XML_TOK_END_TAG: |
if (tagLevel == startTagLevel) |
return XML_ERROR_ASYNC_ENTITY; |
else { |
int len; |
const char *rawName; |
TAG *tag = tagStack; |
tagStack = tag->parent; |
tag->parent = freeTagList; |
freeTagList = tag; |
rawName = s + enc->minBytesPerChar*2; |
len = XmlNameLength(enc, rawName); |
if (len != tag->rawNameLength |
|| memcmp(tag->rawName, rawName, len) != 0) { |
*eventPP = rawName; |
return XML_ERROR_TAG_MISMATCH; |
} |
--tagLevel; |
if (endElementHandler) { |
const XML_Char *localPart; |
const XML_Char *prefix; |
XML_Char *uri; |
localPart = tag->name.localPart; |
if (ns && localPart) { |
/* localPart and prefix may have been overwritten in |
tag->name.str, since this points to the binding->uri |
buffer which gets re-used; so we have to add them again |
*/ |
uri = (XML_Char *)tag->name.str + tag->name.uriLen; |
/* don't need to check for space - already done in storeAtts() */ |
while (*localPart) *uri++ = *localPart++; |
prefix = (XML_Char *)tag->name.prefix; |
if (ns_triplets && prefix) { |
*uri++ = namespaceSeparator; |
while (*prefix) *uri++ = *prefix++; |
} |
*uri = XML_T('\0'); |
} |
endElementHandler(handlerArg, tag->name.str); |
} |
else if (defaultHandler) |
reportDefault(parser, enc, s, next); |
while (tag->bindings) { |
BINDING *b = tag->bindings; |
if (endNamespaceDeclHandler) |
endNamespaceDeclHandler(handlerArg, b->prefix->name); |
tag->bindings = tag->bindings->nextTagBinding; |
b->nextTagBinding = freeBindingList; |
freeBindingList = b; |
b->prefix->binding = b->prevPrefixBinding; |
} |
if (tagLevel == 0) |
return epilogProcessor(parser, next, end, nextPtr); |
} |
break; |
case XML_TOK_CHAR_REF: |
{ |
int n = XmlCharRefNumber(enc, s); |
if (n < 0) |
return XML_ERROR_BAD_CHAR_REF; |
if (characterDataHandler) { |
XML_Char buf[XML_ENCODE_MAX]; |
characterDataHandler(handlerArg, buf, XmlEncode(n, (ICHAR *)buf)); |
} |
else if (defaultHandler) |
reportDefault(parser, enc, s, next); |
} |
break; |
case XML_TOK_XML_DECL: |
return XML_ERROR_MISPLACED_XML_PI; |
case XML_TOK_DATA_NEWLINE: |
if (characterDataHandler) { |
XML_Char c = 0xA; |
characterDataHandler(handlerArg, &c, 1); |
} |
else if (defaultHandler) |
reportDefault(parser, enc, s, next); |
break; |
case XML_TOK_CDATA_SECT_OPEN: |
{ |
enum XML_Error result; |
if (startCdataSectionHandler) |
startCdataSectionHandler(handlerArg); |
#if 0 |
/* Suppose you doing a transformation on a document that involves |
changing only the character data. You set up a defaultHandler |
and a characterDataHandler. The defaultHandler simply copies |
characters through. The characterDataHandler does the |
transformation and writes the characters out escaping them as |
necessary. This case will fail to work if we leave out the |
following two lines (because & and < inside CDATA sections will |
be incorrectly escaped). |
However, now we have a start/endCdataSectionHandler, so it seems |
easier to let the user deal with this. |
*/ |
else if (characterDataHandler) |
characterDataHandler(handlerArg, dataBuf, 0); |
#endif |
else if (defaultHandler) |
reportDefault(parser, enc, s, next); |
result = doCdataSection(parser, enc, &next, end, nextPtr, haveMore); |
if (result != XML_ERROR_NONE) |
return result; |
else if (!next) { |
processor = cdataSectionProcessor; |
return result; |
} |
} |
break; |
case XML_TOK_TRAILING_RSQB: |
if (haveMore) { |
*nextPtr = s; |
return XML_ERROR_NONE; |
} |
if (characterDataHandler) { |
if (MUST_CONVERT(enc, s)) { |
ICHAR *dataPtr = (ICHAR *)dataBuf; |
XmlConvert(enc, &s, end, &dataPtr, (ICHAR *)dataBufEnd); |
characterDataHandler(handlerArg, dataBuf, |
(int)(dataPtr - (ICHAR *)dataBuf)); |
} |
else |
characterDataHandler(handlerArg, |
(XML_Char *)s, |
(int)((XML_Char *)end - (XML_Char *)s)); |
} |
else if (defaultHandler) |
reportDefault(parser, enc, s, end); |
/* We are at the end of the final buffer, should we check for |
XML_SUSPENDED, XML_FINISHED? |
*/ |
if (startTagLevel == 0) { |
*eventPP = end; |
return XML_ERROR_NO_ELEMENTS; |
} |
if (tagLevel != startTagLevel) { |
*eventPP = end; |
return XML_ERROR_ASYNC_ENTITY; |
} |
*nextPtr = end; |
return XML_ERROR_NONE; |
case XML_TOK_DATA_CHARS: |
{ |
XML_CharacterDataHandler charDataHandler = characterDataHandler; |
if (charDataHandler) { |
if (MUST_CONVERT(enc, s)) { |
for (;;) { |
ICHAR *dataPtr = (ICHAR *)dataBuf; |
XmlConvert(enc, &s, next, &dataPtr, (ICHAR *)dataBufEnd); |
*eventEndPP = s; |
charDataHandler(handlerArg, dataBuf, |
(int)(dataPtr - (ICHAR *)dataBuf)); |
if (s == next) |
break; |
*eventPP = s; |
} |
} |
else |
charDataHandler(handlerArg, |
(XML_Char *)s, |
(int)((XML_Char *)next - (XML_Char *)s)); |
} |
else if (defaultHandler) |
reportDefault(parser, enc, s, next); |
} |
break; |
case XML_TOK_PI: |
if (!reportProcessingInstruction(parser, enc, s, next)) |
return XML_ERROR_NO_MEMORY; |
break; |
case XML_TOK_COMMENT: |
if (!reportComment(parser, enc, s, next)) |
return XML_ERROR_NO_MEMORY; |
break; |
default: |
if (defaultHandler) |
reportDefault(parser, enc, s, next); |
break; |
} |
*eventPP = s = next; |
switch (ps_parsing) { |
case XML_SUSPENDED: |
*nextPtr = next; |
return XML_ERROR_NONE; |
case XML_FINISHED: |
return XML_ERROR_ABORTED; |
default: ; |
} |
} |
/* not reached */ |
} |
/* Precondition: all arguments must be non-NULL; |
Purpose: |
- normalize attributes |
- check attributes for well-formedness |
- generate namespace aware attribute names (URI, prefix) |
- build list of attributes for startElementHandler |
- default attributes |
- process namespace declarations (check and report them) |
- generate namespace aware element name (URI, prefix) |
*/ |
static enum XML_Error |
storeAtts(XML_Parser parser, const ENCODING *enc, |
const char *attStr, TAG_NAME *tagNamePtr, |
BINDING **bindingsPtr) |
{ |
DTD * const dtd = _dtd; /* save one level of indirection */ |
ELEMENT_TYPE *elementType; |
int nDefaultAtts; |
const XML_Char **appAtts; /* the attribute list for the application */ |
int attIndex = 0; |
int prefixLen; |
int i; |
int n; |
XML_Char *uri; |
int nPrefixes = 0; |
BINDING *binding; |
const XML_Char *localPart; |
/* lookup the element type name */ |
elementType = (ELEMENT_TYPE *)lookup(parser, &dtd->elementTypes, tagNamePtr->str,0); |
if (!elementType) { |
const XML_Char *name = poolCopyString(&dtd->pool, tagNamePtr->str); |
if (!name) |
return XML_ERROR_NO_MEMORY; |
elementType = (ELEMENT_TYPE *)lookup(parser, &dtd->elementTypes, name, |
sizeof(ELEMENT_TYPE)); |
if (!elementType) |
return XML_ERROR_NO_MEMORY; |
if (ns && !setElementTypePrefix(parser, elementType)) |
return XML_ERROR_NO_MEMORY; |
} |
nDefaultAtts = elementType->nDefaultAtts; |
/* get the attributes from the tokenizer */ |
n = XmlGetAttributes(enc, attStr, attsSize, atts); |
if (n + nDefaultAtts > attsSize) { |
int oldAttsSize = attsSize; |
ATTRIBUTE *temp; |
#ifdef XML_ATTR_INFO |
XML_AttrInfo *temp2; |
#endif |
attsSize = n + nDefaultAtts + INIT_ATTS_SIZE; |
temp = (ATTRIBUTE *)REALLOC((void *)atts, attsSize * sizeof(ATTRIBUTE)); |
if (temp == NULL) |
return XML_ERROR_NO_MEMORY; |
atts = temp; |
#ifdef XML_ATTR_INFO |
temp2 = (XML_AttrInfo *)REALLOC((void *)attInfo, attsSize * sizeof(XML_AttrInfo)); |
if (temp2 == NULL) |
return XML_ERROR_NO_MEMORY; |
attInfo = temp2; |
#endif |
if (n > oldAttsSize) |
XmlGetAttributes(enc, attStr, n, atts); |
} |
appAtts = (const XML_Char **)atts; |
for (i = 0; i < n; i++) { |
ATTRIBUTE *currAtt = &atts[i]; |
#ifdef XML_ATTR_INFO |
XML_AttrInfo *currAttInfo = &attInfo[i]; |
#endif |
/* add the name and value to the attribute list */ |
ATTRIBUTE_ID *attId = getAttributeId(parser, enc, currAtt->name, |
currAtt->name |
+ XmlNameLength(enc, currAtt->name)); |
if (!attId) |
return XML_ERROR_NO_MEMORY; |
#ifdef XML_ATTR_INFO |
currAttInfo->nameStart = parseEndByteIndex - (parseEndPtr - currAtt->name); |
currAttInfo->nameEnd = currAttInfo->nameStart + |
XmlNameLength(enc, currAtt->name); |
currAttInfo->valueStart = parseEndByteIndex - |
(parseEndPtr - currAtt->valuePtr); |
currAttInfo->valueEnd = parseEndByteIndex - (parseEndPtr - currAtt->valueEnd); |
#endif |
/* Detect duplicate attributes by their QNames. This does not work when |
namespace processing is turned on and different prefixes for the same |
namespace are used. For this case we have a check further down. |
*/ |
if ((attId->name)[-1]) { |
if (enc == encoding) |
eventPtr = atts[i].name; |
return XML_ERROR_DUPLICATE_ATTRIBUTE; |
} |
(attId->name)[-1] = 1; |
appAtts[attIndex++] = attId->name; |
if (!atts[i].normalized) { |
enum XML_Error result; |
XML_Bool isCdata = XML_TRUE; |
/* figure out whether declared as other than CDATA */ |
if (attId->maybeTokenized) { |
int j; |
for (j = 0; j < nDefaultAtts; j++) { |
if (attId == elementType->defaultAtts[j].id) { |
isCdata = elementType->defaultAtts[j].isCdata; |
break; |
} |
} |
} |
/* normalize the attribute value */ |
result = storeAttributeValue(parser, enc, isCdata, |
atts[i].valuePtr, atts[i].valueEnd, |
&tempPool); |
if (result) |
return result; |
appAtts[attIndex] = poolStart(&tempPool); |
poolFinish(&tempPool); |
} |
else { |
/* the value did not need normalizing */ |
appAtts[attIndex] = poolStoreString(&tempPool, enc, atts[i].valuePtr, |
atts[i].valueEnd); |
if (appAtts[attIndex] == 0) |
return XML_ERROR_NO_MEMORY; |
poolFinish(&tempPool); |
} |
/* handle prefixed attribute names */ |
if (attId->prefix) { |
if (attId->xmlns) { |
/* deal with namespace declarations here */ |
enum XML_Error result = addBinding(parser, attId->prefix, attId, |
appAtts[attIndex], bindingsPtr); |
if (result) |
return result; |
--attIndex; |
} |
else { |
/* deal with other prefixed names later */ |
attIndex++; |
nPrefixes++; |
(attId->name)[-1] = 2; |
} |
} |
else |
attIndex++; |
} |
/* set-up for XML_GetSpecifiedAttributeCount and XML_GetIdAttributeIndex */ |
nSpecifiedAtts = attIndex; |
if (elementType->idAtt && (elementType->idAtt->name)[-1]) { |
for (i = 0; i < attIndex; i += 2) |
if (appAtts[i] == elementType->idAtt->name) { |
idAttIndex = i; |
break; |
} |
} |
else |
idAttIndex = -1; |
/* do attribute defaulting */ |
for (i = 0; i < nDefaultAtts; i++) { |
const DEFAULT_ATTRIBUTE *da = elementType->defaultAtts + i; |
if (!(da->id->name)[-1] && da->value) { |
if (da->id->prefix) { |
if (da->id->xmlns) { |
enum XML_Error result = addBinding(parser, da->id->prefix, da->id, |
da->value, bindingsPtr); |
if (result) |
return result; |
} |
else { |
(da->id->name)[-1] = 2; |
nPrefixes++; |
appAtts[attIndex++] = da->id->name; |
appAtts[attIndex++] = da->value; |
} |
} |
else { |
(da->id->name)[-1] = 1; |
appAtts[attIndex++] = da->id->name; |
appAtts[attIndex++] = da->value; |
} |
} |
} |
appAtts[attIndex] = 0; |
/* expand prefixed attribute names, check for duplicates, |
and clear flags that say whether attributes were specified */ |
i = 0; |
if (nPrefixes) { |
int j; /* hash table index */ |
unsigned long version = nsAttsVersion; |
int nsAttsSize = (int)1 << nsAttsPower; |
/* size of hash table must be at least 2 * (# of prefixed attributes) */ |
if ((nPrefixes << 1) >> nsAttsPower) { /* true for nsAttsPower = 0 */ |
NS_ATT *temp; |
/* hash table size must also be a power of 2 and >= 8 */ |
while (nPrefixes >> nsAttsPower++); |
if (nsAttsPower < 3) |
nsAttsPower = 3; |
nsAttsSize = (int)1 << nsAttsPower; |
temp = (NS_ATT *)REALLOC(nsAtts, nsAttsSize * sizeof(NS_ATT)); |
if (!temp) |
return XML_ERROR_NO_MEMORY; |
nsAtts = temp; |
version = 0; /* force re-initialization of nsAtts hash table */ |
} |
/* using a version flag saves us from initializing nsAtts every time */ |
if (!version) { /* initialize version flags when version wraps around */ |
version = INIT_ATTS_VERSION; |
for (j = nsAttsSize; j != 0; ) |
nsAtts[--j].version = version; |
} |
nsAttsVersion = --version; |
/* expand prefixed names and check for duplicates */ |
for (; i < attIndex; i += 2) { |
const XML_Char *s = appAtts[i]; |
if (s[-1] == 2) { /* prefixed */ |
ATTRIBUTE_ID *id; |
const BINDING *b; |
unsigned long uriHash = hash_secret_salt; |
((XML_Char *)s)[-1] = 0; /* clear flag */ |
id = (ATTRIBUTE_ID *)lookup(parser, &dtd->attributeIds, s, 0); |
b = id->prefix->binding; |
if (!b) |
return XML_ERROR_UNBOUND_PREFIX; |
/* as we expand the name we also calculate its hash value */ |
for (j = 0; j < b->uriLen; j++) { |
const XML_Char c = b->uri[j]; |
if (!poolAppendChar(&tempPool, c)) |
return XML_ERROR_NO_MEMORY; |
uriHash = CHAR_HASH(uriHash, c); |
} |
while (*s++ != XML_T(ASCII_COLON)) |
; |
do { /* copies null terminator */ |
const XML_Char c = *s; |
if (!poolAppendChar(&tempPool, *s)) |
return XML_ERROR_NO_MEMORY; |
uriHash = CHAR_HASH(uriHash, c); |
} while (*s++); |
{ /* Check hash table for duplicate of expanded name (uriName). |
Derived from code in lookup(parser, HASH_TABLE *table, ...). |
*/ |
unsigned char step = 0; |
unsigned long mask = nsAttsSize - 1; |
j = uriHash & mask; /* index into hash table */ |
while (nsAtts[j].version == version) { |
/* for speed we compare stored hash values first */ |
if (uriHash == nsAtts[j].hash) { |
const XML_Char *s1 = poolStart(&tempPool); |
const XML_Char *s2 = nsAtts[j].uriName; |
/* s1 is null terminated, but not s2 */ |
for (; *s1 == *s2 && *s1 != 0; s1++, s2++); |
if (*s1 == 0) |
return XML_ERROR_DUPLICATE_ATTRIBUTE; |
} |
if (!step) |
step = PROBE_STEP(uriHash, mask, nsAttsPower); |
j < step ? (j += nsAttsSize - step) : (j -= step); |
} |
} |
if (ns_triplets) { /* append namespace separator and prefix */ |
tempPool.ptr[-1] = namespaceSeparator; |
s = b->prefix->name; |
do { |
if (!poolAppendChar(&tempPool, *s)) |
return XML_ERROR_NO_MEMORY; |
} while (*s++); |
} |
/* store expanded name in attribute list */ |
s = poolStart(&tempPool); |
poolFinish(&tempPool); |
appAtts[i] = s; |
/* fill empty slot with new version, uriName and hash value */ |
nsAtts[j].version = version; |
nsAtts[j].hash = uriHash; |
nsAtts[j].uriName = s; |
if (!--nPrefixes) { |
i += 2; |
break; |
} |
} |
else /* not prefixed */ |
((XML_Char *)s)[-1] = 0; /* clear flag */ |
} |
} |
/* clear flags for the remaining attributes */ |
for (; i < attIndex; i += 2) |
((XML_Char *)(appAtts[i]))[-1] = 0; |
for (binding = *bindingsPtr; binding; binding = binding->nextTagBinding) |
binding->attId->name[-1] = 0; |
if (!ns) |
return XML_ERROR_NONE; |
/* expand the element type name */ |
if (elementType->prefix) { |
binding = elementType->prefix->binding; |
if (!binding) |
return XML_ERROR_UNBOUND_PREFIX; |
localPart = tagNamePtr->str; |
while (*localPart++ != XML_T(ASCII_COLON)) |
; |
} |
else if (dtd->defaultPrefix.binding) { |
binding = dtd->defaultPrefix.binding; |
localPart = tagNamePtr->str; |
} |
else |
return XML_ERROR_NONE; |
prefixLen = 0; |
if (ns_triplets && binding->prefix->name) { |
for (; binding->prefix->name[prefixLen++];) |
; /* prefixLen includes null terminator */ |
} |
tagNamePtr->localPart = localPart; |
tagNamePtr->uriLen = binding->uriLen; |
tagNamePtr->prefix = binding->prefix->name; |
tagNamePtr->prefixLen = prefixLen; |
for (i = 0; localPart[i++];) |
; /* i includes null terminator */ |
n = i + binding->uriLen + prefixLen; |
if (n > binding->uriAlloc) { |
TAG *p; |
uri = (XML_Char *)MALLOC((n + EXPAND_SPARE) * sizeof(XML_Char)); |
if (!uri) |
return XML_ERROR_NO_MEMORY; |
binding->uriAlloc = n + EXPAND_SPARE; |
memcpy(uri, binding->uri, binding->uriLen * sizeof(XML_Char)); |
for (p = tagStack; p; p = p->parent) |
if (p->name.str == binding->uri) |
p->name.str = uri; |
FREE(binding->uri); |
binding->uri = uri; |
} |
/* if namespaceSeparator != '\0' then uri includes it already */ |
uri = binding->uri + binding->uriLen; |
memcpy(uri, localPart, i * sizeof(XML_Char)); |
/* we always have a namespace separator between localPart and prefix */ |
if (prefixLen) { |
uri += i - 1; |
*uri = namespaceSeparator; /* replace null terminator */ |
memcpy(uri + 1, binding->prefix->name, prefixLen * sizeof(XML_Char)); |
} |
tagNamePtr->str = binding->uri; |
return XML_ERROR_NONE; |
} |
/* addBinding() overwrites the value of prefix->binding without checking. |
Therefore one must keep track of the old value outside of addBinding(). |
*/ |
static enum XML_Error |
addBinding(XML_Parser parser, PREFIX *prefix, const ATTRIBUTE_ID *attId, |
const XML_Char *uri, BINDING **bindingsPtr) |
{ |
static const XML_Char xmlNamespace[] = { |
ASCII_h, ASCII_t, ASCII_t, ASCII_p, ASCII_COLON, ASCII_SLASH, ASCII_SLASH, |
ASCII_w, ASCII_w, ASCII_w, ASCII_PERIOD, ASCII_w, ASCII_3, ASCII_PERIOD, |
ASCII_o, ASCII_r, ASCII_g, ASCII_SLASH, ASCII_X, ASCII_M, ASCII_L, |
ASCII_SLASH, ASCII_1, ASCII_9, ASCII_9, ASCII_8, ASCII_SLASH, |
ASCII_n, ASCII_a, ASCII_m, ASCII_e, ASCII_s, ASCII_p, ASCII_a, ASCII_c, |
ASCII_e, '\0' |
}; |
static const int xmlLen = |
(int)sizeof(xmlNamespace)/sizeof(XML_Char) - 1; |
static const XML_Char xmlnsNamespace[] = { |
ASCII_h, ASCII_t, ASCII_t, ASCII_p, ASCII_COLON, ASCII_SLASH, ASCII_SLASH, |
ASCII_w, ASCII_w, ASCII_w, ASCII_PERIOD, ASCII_w, ASCII_3, ASCII_PERIOD, |
ASCII_o, ASCII_r, ASCII_g, ASCII_SLASH, ASCII_2, ASCII_0, ASCII_0, |
ASCII_0, ASCII_SLASH, ASCII_x, ASCII_m, ASCII_l, ASCII_n, ASCII_s, |
ASCII_SLASH, '\0' |
}; |
static const int xmlnsLen = |
(int)sizeof(xmlnsNamespace)/sizeof(XML_Char) - 1; |
XML_Bool mustBeXML = XML_FALSE; |
XML_Bool isXML = XML_TRUE; |
XML_Bool isXMLNS = XML_TRUE; |
BINDING *b; |
int len; |
/* empty URI is only valid for default namespace per XML NS 1.0 (not 1.1) */ |
if (*uri == XML_T('\0') && prefix->name) |
return XML_ERROR_UNDECLARING_PREFIX; |
if (prefix->name |
&& prefix->name[0] == XML_T(ASCII_x) |
&& prefix->name[1] == XML_T(ASCII_m) |
&& prefix->name[2] == XML_T(ASCII_l)) { |
/* Not allowed to bind xmlns */ |
if (prefix->name[3] == XML_T(ASCII_n) |
&& prefix->name[4] == XML_T(ASCII_s) |
&& prefix->name[5] == XML_T('\0')) |
return XML_ERROR_RESERVED_PREFIX_XMLNS; |
if (prefix->name[3] == XML_T('\0')) |
mustBeXML = XML_TRUE; |
} |
for (len = 0; uri[len]; len++) { |
if (isXML && (len > xmlLen || uri[len] != xmlNamespace[len])) |
isXML = XML_FALSE; |
if (!mustBeXML && isXMLNS |
&& (len > xmlnsLen || uri[len] != xmlnsNamespace[len])) |
isXMLNS = XML_FALSE; |
} |
isXML = isXML && len == xmlLen; |
isXMLNS = isXMLNS && len == xmlnsLen; |
if (mustBeXML != isXML) |
return mustBeXML ? XML_ERROR_RESERVED_PREFIX_XML |
: XML_ERROR_RESERVED_NAMESPACE_URI; |
if (isXMLNS) |
return XML_ERROR_RESERVED_NAMESPACE_URI; |
if (namespaceSeparator) |
len++; |
if (freeBindingList) { |
b = freeBindingList; |
if (len > b->uriAlloc) { |
XML_Char *temp = (XML_Char *)REALLOC(b->uri, |
sizeof(XML_Char) * (len + EXPAND_SPARE)); |
if (temp == NULL) |
return XML_ERROR_NO_MEMORY; |
b->uri = temp; |
b->uriAlloc = len + EXPAND_SPARE; |
} |
freeBindingList = b->nextTagBinding; |
} |
else { |
b = (BINDING *)MALLOC(sizeof(BINDING)); |
if (!b) |
return XML_ERROR_NO_MEMORY; |
b->uri = (XML_Char *)MALLOC(sizeof(XML_Char) * (len + EXPAND_SPARE)); |
if (!b->uri) { |
FREE(b); |
return XML_ERROR_NO_MEMORY; |
} |
b->uriAlloc = len + EXPAND_SPARE; |
} |
b->uriLen = len; |
memcpy(b->uri, uri, len * sizeof(XML_Char)); |
if (namespaceSeparator) |
b->uri[len - 1] = namespaceSeparator; |
b->prefix = prefix; |
b->attId = attId; |
b->prevPrefixBinding = prefix->binding; |
/* NULL binding when default namespace undeclared */ |
if (*uri == XML_T('\0') && prefix == &_dtd->defaultPrefix) |
prefix->binding = NULL; |
else |
prefix->binding = b; |
b->nextTagBinding = *bindingsPtr; |
*bindingsPtr = b; |
/* if attId == NULL then we are not starting a namespace scope */ |
if (attId && startNamespaceDeclHandler) |
startNamespaceDeclHandler(handlerArg, prefix->name, |
prefix->binding ? uri : 0); |
return XML_ERROR_NONE; |
} |
/* The idea here is to avoid using stack for each CDATA section when |
the whole file is parsed with one call. |
*/ |
static enum XML_Error PTRCALL |
cdataSectionProcessor(XML_Parser parser, |
const char *start, |
const char *end, |
const char **endPtr) |
{ |
enum XML_Error result = doCdataSection(parser, encoding, &start, end, |
endPtr, (XML_Bool)!ps_finalBuffer); |
if (result != XML_ERROR_NONE) |
return result; |
if (start) { |
if (parentParser) { /* we are parsing an external entity */ |
processor = externalEntityContentProcessor; |
return externalEntityContentProcessor(parser, start, end, endPtr); |
} |
else { |
processor = contentProcessor; |
return contentProcessor(parser, start, end, endPtr); |
} |
} |
return result; |
} |
/* startPtr gets set to non-null if the section is closed, and to null if |
the section is not yet closed. |
*/ |
static enum XML_Error |
doCdataSection(XML_Parser parser, |
const ENCODING *enc, |
const char **startPtr, |
const char *end, |
const char **nextPtr, |
XML_Bool haveMore) |
{ |
const char *s = *startPtr; |
const char **eventPP; |
const char **eventEndPP; |
if (enc == encoding) { |
eventPP = &eventPtr; |
*eventPP = s; |
eventEndPP = &eventEndPtr; |
} |
else { |
eventPP = &(openInternalEntities->internalEventPtr); |
eventEndPP = &(openInternalEntities->internalEventEndPtr); |
} |
*eventPP = s; |
*startPtr = NULL; |
for (;;) { |
const char *next; |
int tok = XmlCdataSectionTok(enc, s, end, &next); |
*eventEndPP = next; |
switch (tok) { |
case XML_TOK_CDATA_SECT_CLOSE: |
if (endCdataSectionHandler) |
endCdataSectionHandler(handlerArg); |
#if 0 |
/* see comment under XML_TOK_CDATA_SECT_OPEN */ |
else if (characterDataHandler) |
characterDataHandler(handlerArg, dataBuf, 0); |
#endif |
else if (defaultHandler) |
reportDefault(parser, enc, s, next); |
*startPtr = next; |
*nextPtr = next; |
if (ps_parsing == XML_FINISHED) |
return XML_ERROR_ABORTED; |
else |
return XML_ERROR_NONE; |
case XML_TOK_DATA_NEWLINE: |
if (characterDataHandler) { |
XML_Char c = 0xA; |
characterDataHandler(handlerArg, &c, 1); |
} |
else if (defaultHandler) |
reportDefault(parser, enc, s, next); |
break; |
case XML_TOK_DATA_CHARS: |
{ |
XML_CharacterDataHandler charDataHandler = characterDataHandler; |
if (charDataHandler) { |
if (MUST_CONVERT(enc, s)) { |
for (;;) { |
ICHAR *dataPtr = (ICHAR *)dataBuf; |
XmlConvert(enc, &s, next, &dataPtr, (ICHAR *)dataBufEnd); |
*eventEndPP = next; |
charDataHandler(handlerArg, dataBuf, |
(int)(dataPtr - (ICHAR *)dataBuf)); |
if (s == next) |
break; |
*eventPP = s; |
} |
} |
else |
charDataHandler(handlerArg, |
(XML_Char *)s, |
(int)((XML_Char *)next - (XML_Char *)s)); |
} |
else if (defaultHandler) |
reportDefault(parser, enc, s, next); |
} |
break; |
case XML_TOK_INVALID: |
*eventPP = next; |
return XML_ERROR_INVALID_TOKEN; |
case XML_TOK_PARTIAL_CHAR: |
if (haveMore) { |
*nextPtr = s; |
return XML_ERROR_NONE; |
} |
return XML_ERROR_PARTIAL_CHAR; |
case XML_TOK_PARTIAL: |
case XML_TOK_NONE: |
if (haveMore) { |
*nextPtr = s; |
return XML_ERROR_NONE; |
} |
return XML_ERROR_UNCLOSED_CDATA_SECTION; |
default: |
*eventPP = next; |
return XML_ERROR_UNEXPECTED_STATE; |
} |
*eventPP = s = next; |
switch (ps_parsing) { |
case XML_SUSPENDED: |
*nextPtr = next; |
return XML_ERROR_NONE; |
case XML_FINISHED: |
return XML_ERROR_ABORTED; |
default: ; |
} |
} |
/* not reached */ |
} |
#ifdef XML_DTD |
/* The idea here is to avoid using stack for each IGNORE section when |
the whole file is parsed with one call. |
*/ |
static enum XML_Error PTRCALL |
ignoreSectionProcessor(XML_Parser parser, |
const char *start, |
const char *end, |
const char **endPtr) |
{ |
enum XML_Error result = doIgnoreSection(parser, encoding, &start, end, |
endPtr, (XML_Bool)!ps_finalBuffer); |
if (result != XML_ERROR_NONE) |
return result; |
if (start) { |
processor = prologProcessor; |
return prologProcessor(parser, start, end, endPtr); |
} |
return result; |
} |
/* startPtr gets set to non-null is the section is closed, and to null |
if the section is not yet closed. |
*/ |
static enum XML_Error |
doIgnoreSection(XML_Parser parser, |
const ENCODING *enc, |
const char **startPtr, |
const char *end, |
const char **nextPtr, |
XML_Bool haveMore) |
{ |
const char *next; |
int tok; |
const char *s = *startPtr; |
const char **eventPP; |
const char **eventEndPP; |
if (enc == encoding) { |
eventPP = &eventPtr; |
*eventPP = s; |
eventEndPP = &eventEndPtr; |
} |
else { |
eventPP = &(openInternalEntities->internalEventPtr); |
eventEndPP = &(openInternalEntities->internalEventEndPtr); |
} |
*eventPP = s; |
*startPtr = NULL; |
tok = XmlIgnoreSectionTok(enc, s, end, &next); |
*eventEndPP = next; |
switch (tok) { |
case XML_TOK_IGNORE_SECT: |
if (defaultHandler) |
reportDefault(parser, enc, s, next); |
*startPtr = next; |
*nextPtr = next; |
if (ps_parsing == XML_FINISHED) |
return XML_ERROR_ABORTED; |
else |
return XML_ERROR_NONE; |
case XML_TOK_INVALID: |
*eventPP = next; |
return XML_ERROR_INVALID_TOKEN; |
case XML_TOK_PARTIAL_CHAR: |
if (haveMore) { |
*nextPtr = s; |
return XML_ERROR_NONE; |
} |
return XML_ERROR_PARTIAL_CHAR; |
case XML_TOK_PARTIAL: |
case XML_TOK_NONE: |
if (haveMore) { |
*nextPtr = s; |
return XML_ERROR_NONE; |
} |
return XML_ERROR_SYNTAX; /* XML_ERROR_UNCLOSED_IGNORE_SECTION */ |
default: |
*eventPP = next; |
return XML_ERROR_UNEXPECTED_STATE; |
} |
/* not reached */ |
} |
#endif /* XML_DTD */ |
static enum XML_Error |
initializeEncoding(XML_Parser parser) |
{ |
const char *s; |
#ifdef XML_UNICODE |
char encodingBuf[128]; |
if (!protocolEncodingName) |
s = NULL; |
else { |
int i; |
for (i = 0; protocolEncodingName[i]; i++) { |
if (i == sizeof(encodingBuf) - 1 |
|| (protocolEncodingName[i] & ~0x7f) != 0) { |
encodingBuf[0] = '\0'; |
break; |
} |
encodingBuf[i] = (char)protocolEncodingName[i]; |
} |
encodingBuf[i] = '\0'; |
s = encodingBuf; |
} |
#else |
s = protocolEncodingName; |
#endif |
if ((ns ? XmlInitEncodingNS : XmlInitEncoding)(&initEncoding, &encoding, s)) |
return XML_ERROR_NONE; |
return handleUnknownEncoding(parser, protocolEncodingName); |
} |
static enum XML_Error |
processXmlDecl(XML_Parser parser, int isGeneralTextEntity, |
const char *s, const char *next) |
{ |
const char *encodingName = NULL; |
const XML_Char *storedEncName = NULL; |
const ENCODING *newEncoding = NULL; |
const char *version = NULL; |
const char *versionend; |
const XML_Char *storedversion = NULL; |
int standalone = -1; |
if (!(ns |
? XmlParseXmlDeclNS |
: XmlParseXmlDecl)(isGeneralTextEntity, |
encoding, |
s, |
next, |
&eventPtr, |
&version, |
&versionend, |
&encodingName, |
&newEncoding, |
&standalone)) { |
if (isGeneralTextEntity) |
return XML_ERROR_TEXT_DECL; |
else |
return XML_ERROR_XML_DECL; |
} |
if (!isGeneralTextEntity && standalone == 1) { |
_dtd->standalone = XML_TRUE; |
#ifdef XML_DTD |
if (paramEntityParsing == XML_PARAM_ENTITY_PARSING_UNLESS_STANDALONE) |
paramEntityParsing = XML_PARAM_ENTITY_PARSING_NEVER; |
#endif /* XML_DTD */ |
} |
if (xmlDeclHandler) { |
if (encodingName != NULL) { |
storedEncName = poolStoreString(&temp2Pool, |
encoding, |
encodingName, |
encodingName |
+ XmlNameLength(encoding, encodingName)); |
if (!storedEncName) |
return XML_ERROR_NO_MEMORY; |
poolFinish(&temp2Pool); |
} |
if (version) { |
storedversion = poolStoreString(&temp2Pool, |
encoding, |
version, |
versionend - encoding->minBytesPerChar); |
if (!storedversion) |
return XML_ERROR_NO_MEMORY; |
} |
xmlDeclHandler(handlerArg, storedversion, storedEncName, standalone); |
} |
else if (defaultHandler) |
reportDefault(parser, encoding, s, next); |
if (protocolEncodingName == NULL) { |
if (newEncoding) { |
if (newEncoding->minBytesPerChar != encoding->minBytesPerChar) { |
eventPtr = encodingName; |
return XML_ERROR_INCORRECT_ENCODING; |
} |
encoding = newEncoding; |
} |
else if (encodingName) { |
enum XML_Error result; |
if (!storedEncName) { |
storedEncName = poolStoreString( |
&temp2Pool, encoding, encodingName, |
encodingName + XmlNameLength(encoding, encodingName)); |
if (!storedEncName) |
return XML_ERROR_NO_MEMORY; |
} |
result = handleUnknownEncoding(parser, storedEncName); |
poolClear(&temp2Pool); |
if (result == XML_ERROR_UNKNOWN_ENCODING) |
eventPtr = encodingName; |
return result; |
} |
} |
if (storedEncName || storedversion) |
poolClear(&temp2Pool); |
return XML_ERROR_NONE; |
} |
static enum XML_Error |
handleUnknownEncoding(XML_Parser parser, const XML_Char *encodingName) |
{ |
if (unknownEncodingHandler) { |
XML_Encoding info; |
int i; |
for (i = 0; i < 256; i++) |
info.map[i] = -1; |
info.convert = NULL; |
info.data = NULL; |
info.release = NULL; |
if (unknownEncodingHandler(unknownEncodingHandlerData, encodingName, |
&info)) { |
ENCODING *enc; |
unknownEncodingMem = MALLOC(XmlSizeOfUnknownEncoding()); |
if (!unknownEncodingMem) { |
if (info.release) |
info.release(info.data); |
return XML_ERROR_NO_MEMORY; |
} |
enc = (ns |
? XmlInitUnknownEncodingNS |
: XmlInitUnknownEncoding)(unknownEncodingMem, |
info.map, |
info.convert, |
info.data); |
if (enc) { |
unknownEncodingData = info.data; |
unknownEncodingRelease = info.release; |
encoding = enc; |
return XML_ERROR_NONE; |
} |
} |
if (info.release != NULL) |
info.release(info.data); |
} |
return XML_ERROR_UNKNOWN_ENCODING; |
} |
static enum XML_Error PTRCALL |
prologInitProcessor(XML_Parser parser, |
const char *s, |
const char *end, |
const char **nextPtr) |
{ |
enum XML_Error result = initializeEncoding(parser); |
if (result != XML_ERROR_NONE) |
return result; |
processor = prologProcessor; |
return prologProcessor(parser, s, end, nextPtr); |
} |
#ifdef XML_DTD |
static enum XML_Error PTRCALL |
externalParEntInitProcessor(XML_Parser parser, |
const char *s, |
const char *end, |
const char **nextPtr) |
{ |
enum XML_Error result = initializeEncoding(parser); |
if (result != XML_ERROR_NONE) |
return result; |
/* we know now that XML_Parse(Buffer) has been called, |
so we consider the external parameter entity read */ |
_dtd->paramEntityRead = XML_TRUE; |
if (prologState.inEntityValue) { |
processor = entityValueInitProcessor; |
return entityValueInitProcessor(parser, s, end, nextPtr); |
} |
else { |
processor = externalParEntProcessor; |
return externalParEntProcessor(parser, s, end, nextPtr); |
} |
} |
static enum XML_Error PTRCALL |
entityValueInitProcessor(XML_Parser parser, |
const char *s, |
const char *end, |
const char **nextPtr) |
{ |
int tok; |
const char *start = s; |
const char *next = start; |
eventPtr = start; |
for (;;) { |
tok = XmlPrologTok(encoding, start, end, &next); |
eventEndPtr = next; |
if (tok <= 0) { |
if (!ps_finalBuffer && tok != XML_TOK_INVALID) { |
*nextPtr = s; |
return XML_ERROR_NONE; |
} |
switch (tok) { |
case XML_TOK_INVALID: |
return XML_ERROR_INVALID_TOKEN; |
case XML_TOK_PARTIAL: |
return XML_ERROR_UNCLOSED_TOKEN; |
case XML_TOK_PARTIAL_CHAR: |
return XML_ERROR_PARTIAL_CHAR; |
case XML_TOK_NONE: /* start == end */ |
default: |
break; |
} |
/* found end of entity value - can store it now */ |
return storeEntityValue(parser, encoding, s, end); |
} |
else if (tok == XML_TOK_XML_DECL) { |
enum XML_Error result; |
result = processXmlDecl(parser, 0, start, next); |
if (result != XML_ERROR_NONE) |
return result; |
switch (ps_parsing) { |
case XML_SUSPENDED: |
*nextPtr = next; |
return XML_ERROR_NONE; |
case XML_FINISHED: |
return XML_ERROR_ABORTED; |
default: |
*nextPtr = next; |
} |
/* stop scanning for text declaration - we found one */ |
processor = entityValueProcessor; |
return entityValueProcessor(parser, next, end, nextPtr); |
} |
/* If we are at the end of the buffer, this would cause XmlPrologTok to |
return XML_TOK_NONE on the next call, which would then cause the |
function to exit with *nextPtr set to s - that is what we want for other |
tokens, but not for the BOM - we would rather like to skip it; |
then, when this routine is entered the next time, XmlPrologTok will |
return XML_TOK_INVALID, since the BOM is still in the buffer |
*/ |
else if (tok == XML_TOK_BOM && next == end && !ps_finalBuffer) { |
*nextPtr = next; |
return XML_ERROR_NONE; |
} |
start = next; |
eventPtr = start; |
} |
} |
static enum XML_Error PTRCALL |
externalParEntProcessor(XML_Parser parser, |
const char *s, |
const char *end, |
const char **nextPtr) |
{ |
const char *next = s; |
int tok; |
tok = XmlPrologTok(encoding, s, end, &next); |
if (tok <= 0) { |
if (!ps_finalBuffer && tok != XML_TOK_INVALID) { |
*nextPtr = s; |
return XML_ERROR_NONE; |
} |
switch (tok) { |
case XML_TOK_INVALID: |
return XML_ERROR_INVALID_TOKEN; |
case XML_TOK_PARTIAL: |
return XML_ERROR_UNCLOSED_TOKEN; |
case XML_TOK_PARTIAL_CHAR: |
return XML_ERROR_PARTIAL_CHAR; |
case XML_TOK_NONE: /* start == end */ |
default: |
break; |
} |
} |
/* This would cause the next stage, i.e. doProlog to be passed XML_TOK_BOM. |
However, when parsing an external subset, doProlog will not accept a BOM |
as valid, and report a syntax error, so we have to skip the BOM |
*/ |
else if (tok == XML_TOK_BOM) { |
s = next; |
tok = XmlPrologTok(encoding, s, end, &next); |
} |
processor = prologProcessor; |
return doProlog(parser, encoding, s, end, tok, next, |
nextPtr, (XML_Bool)!ps_finalBuffer); |
} |
static enum XML_Error PTRCALL |
entityValueProcessor(XML_Parser parser, |
const char *s, |
const char *end, |
const char **nextPtr) |
{ |
const char *start = s; |
const char *next = s; |
const ENCODING *enc = encoding; |
int tok; |
for (;;) { |
tok = XmlPrologTok(enc, start, end, &next); |
if (tok <= 0) { |
if (!ps_finalBuffer && tok != XML_TOK_INVALID) { |
*nextPtr = s; |
return XML_ERROR_NONE; |
} |
switch (tok) { |
case XML_TOK_INVALID: |
return XML_ERROR_INVALID_TOKEN; |
case XML_TOK_PARTIAL: |
return XML_ERROR_UNCLOSED_TOKEN; |
case XML_TOK_PARTIAL_CHAR: |
return XML_ERROR_PARTIAL_CHAR; |
case XML_TOK_NONE: /* start == end */ |
default: |
break; |
} |
/* found end of entity value - can store it now */ |
return storeEntityValue(parser, enc, s, end); |
} |
start = next; |
} |
} |
#endif /* XML_DTD */ |
static enum XML_Error PTRCALL |
prologProcessor(XML_Parser parser, |
const char *s, |
const char *end, |
const char **nextPtr) |
{ |
const char *next = s; |
int tok = XmlPrologTok(encoding, s, end, &next); |
return doProlog(parser, encoding, s, end, tok, next, |
nextPtr, (XML_Bool)!ps_finalBuffer); |
} |
static enum XML_Error |
doProlog(XML_Parser parser, |
const ENCODING *enc, |
const char *s, |
const char *end, |
int tok, |
const char *next, |
const char **nextPtr, |
XML_Bool haveMore) |
{ |
#ifdef XML_DTD |
static const XML_Char externalSubsetName[] = { ASCII_HASH , '\0' }; |
#endif /* XML_DTD */ |
static const XML_Char atypeCDATA[] = |
{ ASCII_C, ASCII_D, ASCII_A, ASCII_T, ASCII_A, '\0' }; |
static const XML_Char atypeID[] = { ASCII_I, ASCII_D, '\0' }; |
static const XML_Char atypeIDREF[] = |
{ ASCII_I, ASCII_D, ASCII_R, ASCII_E, ASCII_F, '\0' }; |
static const XML_Char atypeIDREFS[] = |
{ ASCII_I, ASCII_D, ASCII_R, ASCII_E, ASCII_F, ASCII_S, '\0' }; |
static const XML_Char atypeENTITY[] = |
{ ASCII_E, ASCII_N, ASCII_T, ASCII_I, ASCII_T, ASCII_Y, '\0' }; |
static const XML_Char atypeENTITIES[] = { ASCII_E, ASCII_N, |
ASCII_T, ASCII_I, ASCII_T, ASCII_I, ASCII_E, ASCII_S, '\0' }; |
static const XML_Char atypeNMTOKEN[] = { |
ASCII_N, ASCII_M, ASCII_T, ASCII_O, ASCII_K, ASCII_E, ASCII_N, '\0' }; |
static const XML_Char atypeNMTOKENS[] = { ASCII_N, ASCII_M, ASCII_T, |
ASCII_O, ASCII_K, ASCII_E, ASCII_N, ASCII_S, '\0' }; |
static const XML_Char notationPrefix[] = { ASCII_N, ASCII_O, ASCII_T, |
ASCII_A, ASCII_T, ASCII_I, ASCII_O, ASCII_N, ASCII_LPAREN, '\0' }; |
static const XML_Char enumValueSep[] = { ASCII_PIPE, '\0' }; |
static const XML_Char enumValueStart[] = { ASCII_LPAREN, '\0' }; |
/* save one level of indirection */ |
DTD * const dtd = _dtd; |
const char **eventPP; |
const char **eventEndPP; |
enum XML_Content_Quant quant; |
if (enc == encoding) { |
eventPP = &eventPtr; |
eventEndPP = &eventEndPtr; |
} |
else { |
eventPP = &(openInternalEntities->internalEventPtr); |
eventEndPP = &(openInternalEntities->internalEventEndPtr); |
} |
for (;;) { |
int role; |
XML_Bool handleDefault = XML_TRUE; |
*eventPP = s; |
*eventEndPP = next; |
if (tok <= 0) { |
if (haveMore && tok != XML_TOK_INVALID) { |
*nextPtr = s; |
return XML_ERROR_NONE; |
} |
switch (tok) { |
case XML_TOK_INVALID: |
*eventPP = next; |
return XML_ERROR_INVALID_TOKEN; |
case XML_TOK_PARTIAL: |
return XML_ERROR_UNCLOSED_TOKEN; |
case XML_TOK_PARTIAL_CHAR: |
return XML_ERROR_PARTIAL_CHAR; |
case -XML_TOK_PROLOG_S: |
tok = -tok; |
break; |
case XML_TOK_NONE: |
#ifdef XML_DTD |
/* for internal PE NOT referenced between declarations */ |
if (enc != encoding && !openInternalEntities->betweenDecl) { |
*nextPtr = s; |
return XML_ERROR_NONE; |
} |
/* WFC: PE Between Declarations - must check that PE contains |
complete markup, not only for external PEs, but also for |
internal PEs if the reference occurs between declarations. |
*/ |
if (isParamEntity || enc != encoding) { |
if (XmlTokenRole(&prologState, XML_TOK_NONE, end, end, enc) |
== XML_ROLE_ERROR) |
return XML_ERROR_INCOMPLETE_PE; |
*nextPtr = s; |
return XML_ERROR_NONE; |
} |
#endif /* XML_DTD */ |
return XML_ERROR_NO_ELEMENTS; |
default: |
tok = -tok; |
next = end; |
break; |
} |
} |
role = XmlTokenRole(&prologState, tok, s, next, enc); |
switch (role) { |
case XML_ROLE_XML_DECL: |
{ |
enum XML_Error result = processXmlDecl(parser, 0, s, next); |
if (result != XML_ERROR_NONE) |
return result; |
enc = encoding; |
handleDefault = XML_FALSE; |
} |
break; |
case XML_ROLE_DOCTYPE_NAME: |
if (startDoctypeDeclHandler) { |
doctypeName = poolStoreString(&tempPool, enc, s, next); |
if (!doctypeName) |
return XML_ERROR_NO_MEMORY; |
poolFinish(&tempPool); |
doctypePubid = NULL; |
handleDefault = XML_FALSE; |
} |
doctypeSysid = NULL; /* always initialize to NULL */ |
break; |
case XML_ROLE_DOCTYPE_INTERNAL_SUBSET: |
if (startDoctypeDeclHandler) { |
startDoctypeDeclHandler(handlerArg, doctypeName, doctypeSysid, |
doctypePubid, 1); |
doctypeName = NULL; |
poolClear(&tempPool); |
handleDefault = XML_FALSE; |
} |
break; |
#ifdef XML_DTD |
case XML_ROLE_TEXT_DECL: |
{ |
enum XML_Error result = processXmlDecl(parser, 1, s, next); |
if (result != XML_ERROR_NONE) |
return result; |
enc = encoding; |
handleDefault = XML_FALSE; |
} |
break; |
#endif /* XML_DTD */ |
case XML_ROLE_DOCTYPE_PUBLIC_ID: |
#ifdef XML_DTD |
useForeignDTD = XML_FALSE; |
declEntity = (ENTITY *)lookup(parser, |
&dtd->paramEntities, |
externalSubsetName, |
sizeof(ENTITY)); |
if (!declEntity) |
return XML_ERROR_NO_MEMORY; |
#endif /* XML_DTD */ |
dtd->hasParamEntityRefs = XML_TRUE; |
if (startDoctypeDeclHandler) { |
XML_Char *pubId; |
if (!XmlIsPublicId(enc, s, next, eventPP)) |
return XML_ERROR_PUBLICID; |
pubId = poolStoreString(&tempPool, enc, |
s + enc->minBytesPerChar, |
next - enc->minBytesPerChar); |
if (!pubId) |
return XML_ERROR_NO_MEMORY; |
normalizePublicId(pubId); |
poolFinish(&tempPool); |
doctypePubid = pubId; |
handleDefault = XML_FALSE; |
goto alreadyChecked; |
} |
/* fall through */ |
case XML_ROLE_ENTITY_PUBLIC_ID: |
if (!XmlIsPublicId(enc, s, next, eventPP)) |
return XML_ERROR_PUBLICID; |
alreadyChecked: |
if (dtd->keepProcessing && declEntity) { |
XML_Char *tem = poolStoreString(&dtd->pool, |
enc, |
s + enc->minBytesPerChar, |
next - enc->minBytesPerChar); |
if (!tem) |
return XML_ERROR_NO_MEMORY; |
normalizePublicId(tem); |
declEntity->publicId = tem; |
poolFinish(&dtd->pool); |
if (entityDeclHandler) |
handleDefault = XML_FALSE; |
} |
break; |
case XML_ROLE_DOCTYPE_CLOSE: |
if (doctypeName) { |
startDoctypeDeclHandler(handlerArg, doctypeName, |
doctypeSysid, doctypePubid, 0); |
poolClear(&tempPool); |
handleDefault = XML_FALSE; |
} |
/* doctypeSysid will be non-NULL in the case of a previous |
XML_ROLE_DOCTYPE_SYSTEM_ID, even if startDoctypeDeclHandler |
was not set, indicating an external subset |
*/ |
#ifdef XML_DTD |
if (doctypeSysid || useForeignDTD) { |
XML_Bool hadParamEntityRefs = dtd->hasParamEntityRefs; |
dtd->hasParamEntityRefs = XML_TRUE; |
if (paramEntityParsing && externalEntityRefHandler) { |
ENTITY *entity = (ENTITY *)lookup(parser, |
&dtd->paramEntities, |
externalSubsetName, |
sizeof(ENTITY)); |
if (!entity) |
return XML_ERROR_NO_MEMORY; |
if (useForeignDTD) |
entity->base = curBase; |
dtd->paramEntityRead = XML_FALSE; |
if (!externalEntityRefHandler(externalEntityRefHandlerArg, |
0, |
entity->base, |
entity->systemId, |
entity->publicId)) |
return XML_ERROR_EXTERNAL_ENTITY_HANDLING; |
if (dtd->paramEntityRead) { |
if (!dtd->standalone && |
notStandaloneHandler && |
!notStandaloneHandler(handlerArg)) |
return XML_ERROR_NOT_STANDALONE; |
} |
/* if we didn't read the foreign DTD then this means that there |
is no external subset and we must reset dtd->hasParamEntityRefs |
*/ |
else if (!doctypeSysid) |
dtd->hasParamEntityRefs = hadParamEntityRefs; |
/* end of DTD - no need to update dtd->keepProcessing */ |
} |
useForeignDTD = XML_FALSE; |
} |
#endif /* XML_DTD */ |
if (endDoctypeDeclHandler) { |
endDoctypeDeclHandler(handlerArg); |
handleDefault = XML_FALSE; |
} |
break; |
case XML_ROLE_INSTANCE_START: |
#ifdef XML_DTD |
/* if there is no DOCTYPE declaration then now is the |
last chance to read the foreign DTD |
*/ |
if (useForeignDTD) { |
XML_Bool hadParamEntityRefs = dtd->hasParamEntityRefs; |
dtd->hasParamEntityRefs = XML_TRUE; |
if (paramEntityParsing && externalEntityRefHandler) { |
ENTITY *entity = (ENTITY *)lookup(parser, &dtd->paramEntities, |
externalSubsetName, |
sizeof(ENTITY)); |
if (!entity) |
return XML_ERROR_NO_MEMORY; |
entity->base = curBase; |
dtd->paramEntityRead = XML_FALSE; |
if (!externalEntityRefHandler(externalEntityRefHandlerArg, |
0, |
entity->base, |
entity->systemId, |
entity->publicId)) |
return XML_ERROR_EXTERNAL_ENTITY_HANDLING; |
if (dtd->paramEntityRead) { |
if (!dtd->standalone && |
notStandaloneHandler && |
!notStandaloneHandler(handlerArg)) |
return XML_ERROR_NOT_STANDALONE; |
} |
/* if we didn't read the foreign DTD then this means that there |
is no external subset and we must reset dtd->hasParamEntityRefs |
*/ |
else |
dtd->hasParamEntityRefs = hadParamEntityRefs; |
/* end of DTD - no need to update dtd->keepProcessing */ |
} |
} |
#endif /* XML_DTD */ |
processor = contentProcessor; |
return contentProcessor(parser, s, end, nextPtr); |
case XML_ROLE_ATTLIST_ELEMENT_NAME: |
declElementType = getElementType(parser, enc, s, next); |
if (!declElementType) |
return XML_ERROR_NO_MEMORY; |
goto checkAttListDeclHandler; |
case XML_ROLE_ATTRIBUTE_NAME: |
declAttributeId = getAttributeId(parser, enc, s, next); |
if (!declAttributeId) |
return XML_ERROR_NO_MEMORY; |
declAttributeIsCdata = XML_FALSE; |
declAttributeType = NULL; |
declAttributeIsId = XML_FALSE; |
goto checkAttListDeclHandler; |
case XML_ROLE_ATTRIBUTE_TYPE_CDATA: |
declAttributeIsCdata = XML_TRUE; |
declAttributeType = atypeCDATA; |
goto checkAttListDeclHandler; |
case XML_ROLE_ATTRIBUTE_TYPE_ID: |
declAttributeIsId = XML_TRUE; |
declAttributeType = atypeID; |
goto checkAttListDeclHandler; |
case XML_ROLE_ATTRIBUTE_TYPE_IDREF: |
declAttributeType = atypeIDREF; |
goto checkAttListDeclHandler; |
case XML_ROLE_ATTRIBUTE_TYPE_IDREFS: |
declAttributeType = atypeIDREFS; |
goto checkAttListDeclHandler; |
case XML_ROLE_ATTRIBUTE_TYPE_ENTITY: |
declAttributeType = atypeENTITY; |
goto checkAttListDeclHandler; |
case XML_ROLE_ATTRIBUTE_TYPE_ENTITIES: |
declAttributeType = atypeENTITIES; |
goto checkAttListDeclHandler; |
case XML_ROLE_ATTRIBUTE_TYPE_NMTOKEN: |
declAttributeType = atypeNMTOKEN; |
goto checkAttListDeclHandler; |
case XML_ROLE_ATTRIBUTE_TYPE_NMTOKENS: |
declAttributeType = atypeNMTOKENS; |
checkAttListDeclHandler: |
if (dtd->keepProcessing && attlistDeclHandler) |
handleDefault = XML_FALSE; |
break; |
case XML_ROLE_ATTRIBUTE_ENUM_VALUE: |
case XML_ROLE_ATTRIBUTE_NOTATION_VALUE: |
if (dtd->keepProcessing && attlistDeclHandler) { |
const XML_Char *prefix; |
if (declAttributeType) { |
prefix = enumValueSep; |
} |
else { |
prefix = (role == XML_ROLE_ATTRIBUTE_NOTATION_VALUE |
? notationPrefix |
: enumValueStart); |
} |
if (!poolAppendString(&tempPool, prefix)) |
return XML_ERROR_NO_MEMORY; |
if (!poolAppend(&tempPool, enc, s, next)) |
return XML_ERROR_NO_MEMORY; |
declAttributeType = tempPool.start; |
handleDefault = XML_FALSE; |
} |
break; |
case XML_ROLE_IMPLIED_ATTRIBUTE_VALUE: |
case XML_ROLE_REQUIRED_ATTRIBUTE_VALUE: |
if (dtd->keepProcessing) { |
if (!defineAttribute(declElementType, declAttributeId, |
declAttributeIsCdata, declAttributeIsId, |
0, parser)) |
return XML_ERROR_NO_MEMORY; |
if (attlistDeclHandler && declAttributeType) { |
if (*declAttributeType == XML_T(ASCII_LPAREN) |
|| (*declAttributeType == XML_T(ASCII_N) |
&& declAttributeType[1] == XML_T(ASCII_O))) { |
/* Enumerated or Notation type */ |
if (!poolAppendChar(&tempPool, XML_T(ASCII_RPAREN)) |
|| !poolAppendChar(&tempPool, XML_T('\0'))) |
return XML_ERROR_NO_MEMORY; |
declAttributeType = tempPool.start; |
poolFinish(&tempPool); |
} |
*eventEndPP = s; |
attlistDeclHandler(handlerArg, declElementType->name, |
declAttributeId->name, declAttributeType, |
0, role == XML_ROLE_REQUIRED_ATTRIBUTE_VALUE); |
poolClear(&tempPool); |
handleDefault = XML_FALSE; |
} |
} |
break; |
case XML_ROLE_DEFAULT_ATTRIBUTE_VALUE: |
case XML_ROLE_FIXED_ATTRIBUTE_VALUE: |
if (dtd->keepProcessing) { |
const XML_Char *attVal; |
enum XML_Error result = |
storeAttributeValue(parser, enc, declAttributeIsCdata, |
s + enc->minBytesPerChar, |
next - enc->minBytesPerChar, |
&dtd->pool); |
if (result) |
return result; |
attVal = poolStart(&dtd->pool); |
poolFinish(&dtd->pool); |
/* ID attributes aren't allowed to have a default */ |
if (!defineAttribute(declElementType, declAttributeId, |
declAttributeIsCdata, XML_FALSE, attVal, parser)) |
return XML_ERROR_NO_MEMORY; |
if (attlistDeclHandler && declAttributeType) { |
if (*declAttributeType == XML_T(ASCII_LPAREN) |
|| (*declAttributeType == XML_T(ASCII_N) |
&& declAttributeType[1] == XML_T(ASCII_O))) { |
/* Enumerated or Notation type */ |
if (!poolAppendChar(&tempPool, XML_T(ASCII_RPAREN)) |
|| !poolAppendChar(&tempPool, XML_T('\0'))) |
return XML_ERROR_NO_MEMORY; |
declAttributeType = tempPool.start; |
poolFinish(&tempPool); |
} |
*eventEndPP = s; |
attlistDeclHandler(handlerArg, declElementType->name, |
declAttributeId->name, declAttributeType, |
attVal, |
role == XML_ROLE_FIXED_ATTRIBUTE_VALUE); |
poolClear(&tempPool); |
handleDefault = XML_FALSE; |
} |
} |
break; |
case XML_ROLE_ENTITY_VALUE: |
if (dtd->keepProcessing) { |
enum XML_Error result = storeEntityValue(parser, enc, |
s + enc->minBytesPerChar, |
next - enc->minBytesPerChar); |
if (declEntity) { |
declEntity->textPtr = poolStart(&dtd->entityValuePool); |
declEntity->textLen = (int)(poolLength(&dtd->entityValuePool)); |
poolFinish(&dtd->entityValuePool); |
if (entityDeclHandler) { |
*eventEndPP = s; |
entityDeclHandler(handlerArg, |
declEntity->name, |
declEntity->is_param, |
declEntity->textPtr, |
declEntity->textLen, |
curBase, 0, 0, 0); |
handleDefault = XML_FALSE; |
} |
} |
else |
poolDiscard(&dtd->entityValuePool); |
if (result != XML_ERROR_NONE) |
return result; |
} |
break; |
case XML_ROLE_DOCTYPE_SYSTEM_ID: |
#ifdef XML_DTD |
useForeignDTD = XML_FALSE; |
#endif /* XML_DTD */ |
dtd->hasParamEntityRefs = XML_TRUE; |
if (startDoctypeDeclHandler) { |
doctypeSysid = poolStoreString(&tempPool, enc, |
s + enc->minBytesPerChar, |
next - enc->minBytesPerChar); |
if (doctypeSysid == NULL) |
return XML_ERROR_NO_MEMORY; |
poolFinish(&tempPool); |
handleDefault = XML_FALSE; |
} |
#ifdef XML_DTD |
else |
/* use externalSubsetName to make doctypeSysid non-NULL |
for the case where no startDoctypeDeclHandler is set */ |
doctypeSysid = externalSubsetName; |
#endif /* XML_DTD */ |
if (!dtd->standalone |
#ifdef XML_DTD |
&& !paramEntityParsing |
#endif /* XML_DTD */ |
&& notStandaloneHandler |
&& !notStandaloneHandler(handlerArg)) |
return XML_ERROR_NOT_STANDALONE; |
#ifndef XML_DTD |
break; |
#else /* XML_DTD */ |
if (!declEntity) { |
declEntity = (ENTITY *)lookup(parser, |
&dtd->paramEntities, |
externalSubsetName, |
sizeof(ENTITY)); |
if (!declEntity) |
return XML_ERROR_NO_MEMORY; |
declEntity->publicId = NULL; |
} |
/* fall through */ |
#endif /* XML_DTD */ |
case XML_ROLE_ENTITY_SYSTEM_ID: |
if (dtd->keepProcessing && declEntity) { |
declEntity->systemId = poolStoreString(&dtd->pool, enc, |
s + enc->minBytesPerChar, |
next - enc->minBytesPerChar); |
if (!declEntity->systemId) |
return XML_ERROR_NO_MEMORY; |
declEntity->base = curBase; |
poolFinish(&dtd->pool); |
if (entityDeclHandler) |
handleDefault = XML_FALSE; |
} |
break; |
case XML_ROLE_ENTITY_COMPLETE: |
if (dtd->keepProcessing && declEntity && entityDeclHandler) { |
*eventEndPP = s; |
entityDeclHandler(handlerArg, |
declEntity->name, |
declEntity->is_param, |
0,0, |
declEntity->base, |
declEntity->systemId, |
declEntity->publicId, |
0); |
handleDefault = XML_FALSE; |
} |
break; |
case XML_ROLE_ENTITY_NOTATION_NAME: |
if (dtd->keepProcessing && declEntity) { |
declEntity->notation = poolStoreString(&dtd->pool, enc, s, next); |
if (!declEntity->notation) |
return XML_ERROR_NO_MEMORY; |
poolFinish(&dtd->pool); |
if (unparsedEntityDeclHandler) { |
*eventEndPP = s; |
unparsedEntityDeclHandler(handlerArg, |
declEntity->name, |
declEntity->base, |
declEntity->systemId, |
declEntity->publicId, |
declEntity->notation); |
handleDefault = XML_FALSE; |
} |
else if (entityDeclHandler) { |
*eventEndPP = s; |
entityDeclHandler(handlerArg, |
declEntity->name, |
0,0,0, |
declEntity->base, |
declEntity->systemId, |
declEntity->publicId, |
declEntity->notation); |
handleDefault = XML_FALSE; |
} |
} |
break; |
case XML_ROLE_GENERAL_ENTITY_NAME: |
{ |
if (XmlPredefinedEntityName(enc, s, next)) { |
declEntity = NULL; |
break; |
} |
if (dtd->keepProcessing) { |
const XML_Char *name = poolStoreString(&dtd->pool, enc, s, next); |
if (!name) |
return XML_ERROR_NO_MEMORY; |
declEntity = (ENTITY *)lookup(parser, &dtd->generalEntities, name, |
sizeof(ENTITY)); |
if (!declEntity) |
return XML_ERROR_NO_MEMORY; |
if (declEntity->name != name) { |
poolDiscard(&dtd->pool); |
declEntity = NULL; |
} |
else { |
poolFinish(&dtd->pool); |
declEntity->publicId = NULL; |
declEntity->is_param = XML_FALSE; |
/* if we have a parent parser or are reading an internal parameter |
entity, then the entity declaration is not considered "internal" |
*/ |
declEntity->is_internal = !(parentParser || openInternalEntities); |
if (entityDeclHandler) |
handleDefault = XML_FALSE; |
} |
} |
else { |
poolDiscard(&dtd->pool); |
declEntity = NULL; |
} |
} |
break; |
case XML_ROLE_PARAM_ENTITY_NAME: |
#ifdef XML_DTD |
if (dtd->keepProcessing) { |
const XML_Char *name = poolStoreString(&dtd->pool, enc, s, next); |
if (!name) |
return XML_ERROR_NO_MEMORY; |
declEntity = (ENTITY *)lookup(parser, &dtd->paramEntities, |
name, sizeof(ENTITY)); |
if (!declEntity) |
return XML_ERROR_NO_MEMORY; |
if (declEntity->name != name) { |
poolDiscard(&dtd->pool); |
declEntity = NULL; |
} |
else { |
poolFinish(&dtd->pool); |
declEntity->publicId = NULL; |
declEntity->is_param = XML_TRUE; |
/* if we have a parent parser or are reading an internal parameter |
entity, then the entity declaration is not considered "internal" |
*/ |
declEntity->is_internal = !(parentParser || openInternalEntities); |
if (entityDeclHandler) |
handleDefault = XML_FALSE; |
} |
} |
else { |
poolDiscard(&dtd->pool); |
declEntity = NULL; |
} |
#else /* not XML_DTD */ |
declEntity = NULL; |
#endif /* XML_DTD */ |
break; |
case XML_ROLE_NOTATION_NAME: |
declNotationPublicId = NULL; |
declNotationName = NULL; |
if (notationDeclHandler) { |
declNotationName = poolStoreString(&tempPool, enc, s, next); |
if (!declNotationName) |
return XML_ERROR_NO_MEMORY; |
poolFinish(&tempPool); |
handleDefault = XML_FALSE; |
} |
break; |
case XML_ROLE_NOTATION_PUBLIC_ID: |
if (!XmlIsPublicId(enc, s, next, eventPP)) |
return XML_ERROR_PUBLICID; |
if (declNotationName) { /* means notationDeclHandler != NULL */ |
XML_Char *tem = poolStoreString(&tempPool, |
enc, |
s + enc->minBytesPerChar, |
next - enc->minBytesPerChar); |
if (!tem) |
return XML_ERROR_NO_MEMORY; |
normalizePublicId(tem); |
declNotationPublicId = tem; |
poolFinish(&tempPool); |
handleDefault = XML_FALSE; |
} |
break; |
case XML_ROLE_NOTATION_SYSTEM_ID: |
if (declNotationName && notationDeclHandler) { |
const XML_Char *systemId |
= poolStoreString(&tempPool, enc, |
s + enc->minBytesPerChar, |
next - enc->minBytesPerChar); |
if (!systemId) |
return XML_ERROR_NO_MEMORY; |
*eventEndPP = s; |
notationDeclHandler(handlerArg, |
declNotationName, |
curBase, |
systemId, |
declNotationPublicId); |
handleDefault = XML_FALSE; |
} |
poolClear(&tempPool); |
break; |
case XML_ROLE_NOTATION_NO_SYSTEM_ID: |
if (declNotationPublicId && notationDeclHandler) { |
*eventEndPP = s; |
notationDeclHandler(handlerArg, |
declNotationName, |
curBase, |
0, |
declNotationPublicId); |
handleDefault = XML_FALSE; |
} |
poolClear(&tempPool); |
break; |
case XML_ROLE_ERROR: |
switch (tok) { |
case XML_TOK_PARAM_ENTITY_REF: |
/* PE references in internal subset are |
not allowed within declarations. */ |
return XML_ERROR_PARAM_ENTITY_REF; |
case XML_TOK_XML_DECL: |
return XML_ERROR_MISPLACED_XML_PI; |
default: |
return XML_ERROR_SYNTAX; |
} |
#ifdef XML_DTD |
case XML_ROLE_IGNORE_SECT: |
{ |
enum XML_Error result; |
if (defaultHandler) |
reportDefault(parser, enc, s, next); |
handleDefault = XML_FALSE; |
result = doIgnoreSection(parser, enc, &next, end, nextPtr, haveMore); |
if (result != XML_ERROR_NONE) |
return result; |
else if (!next) { |
processor = ignoreSectionProcessor; |
return result; |
} |
} |
break; |
#endif /* XML_DTD */ |
case XML_ROLE_GROUP_OPEN: |
if (prologState.level >= groupSize) { |
if (groupSize) { |
char *temp = (char *)REALLOC(groupConnector, groupSize *= 2); |
if (temp == NULL) |
return XML_ERROR_NO_MEMORY; |
groupConnector = temp; |
if (dtd->scaffIndex) { |
int *temp = (int *)REALLOC(dtd->scaffIndex, |
groupSize * sizeof(int)); |
if (temp == NULL) |
return XML_ERROR_NO_MEMORY; |
dtd->scaffIndex = temp; |
} |
} |
else { |
groupConnector = (char *)MALLOC(groupSize = 32); |
if (!groupConnector) |
return XML_ERROR_NO_MEMORY; |
} |
} |
groupConnector[prologState.level] = 0; |
if (dtd->in_eldecl) { |
int myindex = nextScaffoldPart(parser); |
if (myindex < 0) |
return XML_ERROR_NO_MEMORY; |
dtd->scaffIndex[dtd->scaffLevel] = myindex; |
dtd->scaffLevel++; |
dtd->scaffold[myindex].type = XML_CTYPE_SEQ; |
if (elementDeclHandler) |
handleDefault = XML_FALSE; |
} |
break; |
case XML_ROLE_GROUP_SEQUENCE: |
if (groupConnector[prologState.level] == ASCII_PIPE) |
return XML_ERROR_SYNTAX; |
groupConnector[prologState.level] = ASCII_COMMA; |
if (dtd->in_eldecl && elementDeclHandler) |
handleDefault = XML_FALSE; |
break; |
case XML_ROLE_GROUP_CHOICE: |
if (groupConnector[prologState.level] == ASCII_COMMA) |
return XML_ERROR_SYNTAX; |
if (dtd->in_eldecl |
&& !groupConnector[prologState.level] |
&& (dtd->scaffold[dtd->scaffIndex[dtd->scaffLevel - 1]].type |
!= XML_CTYPE_MIXED) |
) { |
dtd->scaffold[dtd->scaffIndex[dtd->scaffLevel - 1]].type |
= XML_CTYPE_CHOICE; |
if (elementDeclHandler) |
handleDefault = XML_FALSE; |
} |
groupConnector[prologState.level] = ASCII_PIPE; |
break; |
case XML_ROLE_PARAM_ENTITY_REF: |
#ifdef XML_DTD |
case XML_ROLE_INNER_PARAM_ENTITY_REF: |
dtd->hasParamEntityRefs = XML_TRUE; |
if (!paramEntityParsing) |
dtd->keepProcessing = dtd->standalone; |
else { |
const XML_Char *name; |
ENTITY *entity; |
name = poolStoreString(&dtd->pool, enc, |
s + enc->minBytesPerChar, |
next - enc->minBytesPerChar); |
if (!name) |
return XML_ERROR_NO_MEMORY; |
entity = (ENTITY *)lookup(parser, &dtd->paramEntities, name, 0); |
poolDiscard(&dtd->pool); |
/* first, determine if a check for an existing declaration is needed; |
if yes, check that the entity exists, and that it is internal, |
otherwise call the skipped entity handler |
*/ |
if (prologState.documentEntity && |
(dtd->standalone |
? !openInternalEntities |
: !dtd->hasParamEntityRefs)) { |
if (!entity) |
return XML_ERROR_UNDEFINED_ENTITY; |
else if (!entity->is_internal) |
return XML_ERROR_ENTITY_DECLARED_IN_PE; |
} |
else if (!entity) { |
dtd->keepProcessing = dtd->standalone; |
/* cannot report skipped entities in declarations */ |
if ((role == XML_ROLE_PARAM_ENTITY_REF) && skippedEntityHandler) { |
skippedEntityHandler(handlerArg, name, 1); |
handleDefault = XML_FALSE; |
} |
break; |
} |
if (entity->open) |
return XML_ERROR_RECURSIVE_ENTITY_REF; |
if (entity->textPtr) { |
enum XML_Error result; |
XML_Bool betweenDecl = |
(role == XML_ROLE_PARAM_ENTITY_REF ? XML_TRUE : XML_FALSE); |
result = processInternalEntity(parser, entity, betweenDecl); |
if (result != XML_ERROR_NONE) |
return result; |
handleDefault = XML_FALSE; |
break; |
} |
if (externalEntityRefHandler) { |
dtd->paramEntityRead = XML_FALSE; |
entity->open = XML_TRUE; |
if (!externalEntityRefHandler(externalEntityRefHandlerArg, |
0, |
entity->base, |
entity->systemId, |
entity->publicId)) { |
entity->open = XML_FALSE; |
return XML_ERROR_EXTERNAL_ENTITY_HANDLING; |
} |
entity->open = XML_FALSE; |
handleDefault = XML_FALSE; |
if (!dtd->paramEntityRead) { |
dtd->keepProcessing = dtd->standalone; |
break; |
} |
} |
else { |
dtd->keepProcessing = dtd->standalone; |
break; |
} |
} |
#endif /* XML_DTD */ |
if (!dtd->standalone && |
notStandaloneHandler && |
!notStandaloneHandler(handlerArg)) |
return XML_ERROR_NOT_STANDALONE; |
break; |
/* Element declaration stuff */ |
case XML_ROLE_ELEMENT_NAME: |
if (elementDeclHandler) { |
declElementType = getElementType(parser, enc, s, next); |
if (!declElementType) |
return XML_ERROR_NO_MEMORY; |
dtd->scaffLevel = 0; |
dtd->scaffCount = 0; |
dtd->in_eldecl = XML_TRUE; |
handleDefault = XML_FALSE; |
} |
break; |
case XML_ROLE_CONTENT_ANY: |
case XML_ROLE_CONTENT_EMPTY: |
if (dtd->in_eldecl) { |
if (elementDeclHandler) { |
XML_Content * content = (XML_Content *) MALLOC(sizeof(XML_Content)); |
if (!content) |
return XML_ERROR_NO_MEMORY; |
content->quant = XML_CQUANT_NONE; |
content->name = NULL; |
content->numchildren = 0; |
content->children = NULL; |
content->type = ((role == XML_ROLE_CONTENT_ANY) ? |
XML_CTYPE_ANY : |
XML_CTYPE_EMPTY); |
*eventEndPP = s; |
elementDeclHandler(handlerArg, declElementType->name, content); |
handleDefault = XML_FALSE; |
} |
dtd->in_eldecl = XML_FALSE; |
} |
break; |
case XML_ROLE_CONTENT_PCDATA: |
if (dtd->in_eldecl) { |
dtd->scaffold[dtd->scaffIndex[dtd->scaffLevel - 1]].type |
= XML_CTYPE_MIXED; |
if (elementDeclHandler) |
handleDefault = XML_FALSE; |
} |
break; |
case XML_ROLE_CONTENT_ELEMENT: |
quant = XML_CQUANT_NONE; |
goto elementContent; |
case XML_ROLE_CONTENT_ELEMENT_OPT: |
quant = XML_CQUANT_OPT; |
goto elementContent; |
case XML_ROLE_CONTENT_ELEMENT_REP: |
quant = XML_CQUANT_REP; |
goto elementContent; |
case XML_ROLE_CONTENT_ELEMENT_PLUS: |
quant = XML_CQUANT_PLUS; |
elementContent: |
if (dtd->in_eldecl) { |
ELEMENT_TYPE *el; |
const XML_Char *name; |
int nameLen; |
const char *nxt = (quant == XML_CQUANT_NONE |
? next |
: next - enc->minBytesPerChar); |
int myindex = nextScaffoldPart(parser); |
if (myindex < 0) |
return XML_ERROR_NO_MEMORY; |
dtd->scaffold[myindex].type = XML_CTYPE_NAME; |
dtd->scaffold[myindex].quant = quant; |
el = getElementType(parser, enc, s, nxt); |
if (!el) |
return XML_ERROR_NO_MEMORY; |
name = el->name; |
dtd->scaffold[myindex].name = name; |
nameLen = 0; |
for (; name[nameLen++]; ); |
dtd->contentStringLen += nameLen; |
if (elementDeclHandler) |
handleDefault = XML_FALSE; |
} |
break; |
case XML_ROLE_GROUP_CLOSE: |
quant = XML_CQUANT_NONE; |
goto closeGroup; |
case XML_ROLE_GROUP_CLOSE_OPT: |
quant = XML_CQUANT_OPT; |
goto closeGroup; |
case XML_ROLE_GROUP_CLOSE_REP: |
quant = XML_CQUANT_REP; |
goto closeGroup; |
case XML_ROLE_GROUP_CLOSE_PLUS: |
quant = XML_CQUANT_PLUS; |
closeGroup: |
if (dtd->in_eldecl) { |
if (elementDeclHandler) |
handleDefault = XML_FALSE; |
dtd->scaffLevel--; |
dtd->scaffold[dtd->scaffIndex[dtd->scaffLevel]].quant = quant; |
if (dtd->scaffLevel == 0) { |
if (!handleDefault) { |
XML_Content *model = build_model(parser); |
if (!model) |
return XML_ERROR_NO_MEMORY; |
*eventEndPP = s; |
elementDeclHandler(handlerArg, declElementType->name, model); |
} |
dtd->in_eldecl = XML_FALSE; |
dtd->contentStringLen = 0; |
} |
} |
break; |
/* End element declaration stuff */ |
case XML_ROLE_PI: |
if (!reportProcessingInstruction(parser, enc, s, next)) |
return XML_ERROR_NO_MEMORY; |
handleDefault = XML_FALSE; |
break; |
case XML_ROLE_COMMENT: |
if (!reportComment(parser, enc, s, next)) |
return XML_ERROR_NO_MEMORY; |
handleDefault = XML_FALSE; |
break; |
case XML_ROLE_NONE: |
switch (tok) { |
case XML_TOK_BOM: |
handleDefault = XML_FALSE; |
break; |
} |
break; |
case XML_ROLE_DOCTYPE_NONE: |
if (startDoctypeDeclHandler) |
handleDefault = XML_FALSE; |
break; |
case XML_ROLE_ENTITY_NONE: |
if (dtd->keepProcessing && entityDeclHandler) |
handleDefault = XML_FALSE; |
break; |
case XML_ROLE_NOTATION_NONE: |
if (notationDeclHandler) |
handleDefault = XML_FALSE; |
break; |
case XML_ROLE_ATTLIST_NONE: |
if (dtd->keepProcessing && attlistDeclHandler) |
handleDefault = XML_FALSE; |
break; |
case XML_ROLE_ELEMENT_NONE: |
if (elementDeclHandler) |
handleDefault = XML_FALSE; |
break; |
} /* end of big switch */ |
if (handleDefault && defaultHandler) |
reportDefault(parser, enc, s, next); |
switch (ps_parsing) { |
case XML_SUSPENDED: |
*nextPtr = next; |
return XML_ERROR_NONE; |
case XML_FINISHED: |
return XML_ERROR_ABORTED; |
default: |
s = next; |
tok = XmlPrologTok(enc, s, end, &next); |
} |
} |
/* not reached */ |
} |
static enum XML_Error PTRCALL |
epilogProcessor(XML_Parser parser, |
const char *s, |
const char *end, |
const char **nextPtr) |
{ |
processor = epilogProcessor; |
eventPtr = s; |
for (;;) { |
const char *next = NULL; |
int tok = XmlPrologTok(encoding, s, end, &next); |
eventEndPtr = next; |
switch (tok) { |
/* report partial linebreak - it might be the last token */ |
case -XML_TOK_PROLOG_S: |
if (defaultHandler) { |
reportDefault(parser, encoding, s, next); |
if (ps_parsing == XML_FINISHED) |
return XML_ERROR_ABORTED; |
} |
*nextPtr = next; |
return XML_ERROR_NONE; |
case XML_TOK_NONE: |
*nextPtr = s; |
return XML_ERROR_NONE; |
case XML_TOK_PROLOG_S: |
if (defaultHandler) |
reportDefault(parser, encoding, s, next); |
break; |
case XML_TOK_PI: |
if (!reportProcessingInstruction(parser, encoding, s, next)) |
return XML_ERROR_NO_MEMORY; |
break; |
case XML_TOK_COMMENT: |
if (!reportComment(parser, encoding, s, next)) |
return XML_ERROR_NO_MEMORY; |
break; |
case XML_TOK_INVALID: |
eventPtr = next; |
return XML_ERROR_INVALID_TOKEN; |
case XML_TOK_PARTIAL: |
if (!ps_finalBuffer) { |
*nextPtr = s; |
return XML_ERROR_NONE; |
} |
return XML_ERROR_UNCLOSED_TOKEN; |
case XML_TOK_PARTIAL_CHAR: |
if (!ps_finalBuffer) { |
*nextPtr = s; |
return XML_ERROR_NONE; |
} |
return XML_ERROR_PARTIAL_CHAR; |
default: |
return XML_ERROR_JUNK_AFTER_DOC_ELEMENT; |
} |
eventPtr = s = next; |
switch (ps_parsing) { |
case XML_SUSPENDED: |
*nextPtr = next; |
return XML_ERROR_NONE; |
case XML_FINISHED: |
return XML_ERROR_ABORTED; |
default: ; |
} |
} |
} |
static enum XML_Error |
processInternalEntity(XML_Parser parser, ENTITY *entity, |
XML_Bool betweenDecl) |
{ |
const char *textStart, *textEnd; |
const char *next; |
enum XML_Error result; |
OPEN_INTERNAL_ENTITY *openEntity; |
if (freeInternalEntities) { |
openEntity = freeInternalEntities; |
freeInternalEntities = openEntity->next; |
} |
else { |
openEntity = (OPEN_INTERNAL_ENTITY *)MALLOC(sizeof(OPEN_INTERNAL_ENTITY)); |
if (!openEntity) |
return XML_ERROR_NO_MEMORY; |
} |
entity->open = XML_TRUE; |
entity->processed = 0; |
openEntity->next = openInternalEntities; |
openInternalEntities = openEntity; |
openEntity->entity = entity; |
openEntity->startTagLevel = tagLevel; |
openEntity->betweenDecl = betweenDecl; |
openEntity->internalEventPtr = NULL; |
openEntity->internalEventEndPtr = NULL; |
textStart = (char *)entity->textPtr; |
textEnd = (char *)(entity->textPtr + entity->textLen); |
#ifdef XML_DTD |
if (entity->is_param) { |
int tok = XmlPrologTok(internalEncoding, textStart, textEnd, &next); |
result = doProlog(parser, internalEncoding, textStart, textEnd, tok, |
next, &next, XML_FALSE); |
} |
else |
#endif /* XML_DTD */ |
result = doContent(parser, tagLevel, internalEncoding, textStart, |
textEnd, &next, XML_FALSE); |
if (result == XML_ERROR_NONE) { |
if (textEnd != next && ps_parsing == XML_SUSPENDED) { |
entity->processed = (int)(next - textStart); |
processor = internalEntityProcessor; |
} |
else { |
entity->open = XML_FALSE; |
openInternalEntities = openEntity->next; |
/* put openEntity back in list of free instances */ |
openEntity->next = freeInternalEntities; |
freeInternalEntities = openEntity; |
} |
} |
return result; |
} |
static enum XML_Error PTRCALL |
internalEntityProcessor(XML_Parser parser, |
const char *s, |
const char *end, |
const char **nextPtr) |
{ |
ENTITY *entity; |
const char *textStart, *textEnd; |
const char *next; |
enum XML_Error result; |
OPEN_INTERNAL_ENTITY *openEntity = openInternalEntities; |
if (!openEntity) |
return XML_ERROR_UNEXPECTED_STATE; |
entity = openEntity->entity; |
textStart = ((char *)entity->textPtr) + entity->processed; |
textEnd = (char *)(entity->textPtr + entity->textLen); |
#ifdef XML_DTD |
if (entity->is_param) { |
int tok = XmlPrologTok(internalEncoding, textStart, textEnd, &next); |
result = doProlog(parser, internalEncoding, textStart, textEnd, tok, |
next, &next, XML_FALSE); |
} |
else |
#endif /* XML_DTD */ |
result = doContent(parser, openEntity->startTagLevel, internalEncoding, |
textStart, textEnd, &next, XML_FALSE); |
if (result != XML_ERROR_NONE) |
return result; |
else if (textEnd != next && ps_parsing == XML_SUSPENDED) { |
entity->processed = (int)(next - (char *)entity->textPtr); |
return result; |
} |
else { |
entity->open = XML_FALSE; |
openInternalEntities = openEntity->next; |
/* put openEntity back in list of free instances */ |
openEntity->next = freeInternalEntities; |
freeInternalEntities = openEntity; |
} |
#ifdef XML_DTD |
if (entity->is_param) { |
int tok; |
processor = prologProcessor; |
tok = XmlPrologTok(encoding, s, end, &next); |
return doProlog(parser, encoding, s, end, tok, next, nextPtr, |
(XML_Bool)!ps_finalBuffer); |
} |
else |
#endif /* XML_DTD */ |
{ |
processor = contentProcessor; |
/* see externalEntityContentProcessor vs contentProcessor */ |
return doContent(parser, parentParser ? 1 : 0, encoding, s, end, |
nextPtr, (XML_Bool)!ps_finalBuffer); |
} |
} |
static enum XML_Error PTRCALL |
errorProcessor(XML_Parser parser, |
const char *s, |
const char *end, |
const char **nextPtr) |
{ |
return errorCode; |
} |
static enum XML_Error |
storeAttributeValue(XML_Parser parser, const ENCODING *enc, XML_Bool isCdata, |
const char *ptr, const char *end, |
STRING_POOL *pool) |
{ |
enum XML_Error result = appendAttributeValue(parser, enc, isCdata, ptr, |
end, pool); |
if (result) |
return result; |
if (!isCdata && poolLength(pool) && poolLastChar(pool) == 0x20) |
poolChop(pool); |
if (!poolAppendChar(pool, XML_T('\0'))) |
return XML_ERROR_NO_MEMORY; |
return XML_ERROR_NONE; |
} |
static enum XML_Error |
appendAttributeValue(XML_Parser parser, const ENCODING *enc, XML_Bool isCdata, |
const char *ptr, const char *end, |
STRING_POOL *pool) |
{ |
DTD * const dtd = _dtd; /* save one level of indirection */ |
for (;;) { |
const char *next; |
int tok = XmlAttributeValueTok(enc, ptr, end, &next); |
switch (tok) { |
case XML_TOK_NONE: |
return XML_ERROR_NONE; |
case XML_TOK_INVALID: |
if (enc == encoding) |
eventPtr = next; |
return XML_ERROR_INVALID_TOKEN; |
case XML_TOK_PARTIAL: |
if (enc == encoding) |
eventPtr = ptr; |
return XML_ERROR_INVALID_TOKEN; |
case XML_TOK_CHAR_REF: |
{ |
XML_Char buf[XML_ENCODE_MAX]; |
int i; |
int n = XmlCharRefNumber(enc, ptr); |
if (n < 0) { |
if (enc == encoding) |
eventPtr = ptr; |
return XML_ERROR_BAD_CHAR_REF; |
} |
if (!isCdata |
&& n == 0x20 /* space */ |
&& (poolLength(pool) == 0 || poolLastChar(pool) == 0x20)) |
break; |
n = XmlEncode(n, (ICHAR *)buf); |
if (!n) { |
if (enc == encoding) |
eventPtr = ptr; |
return XML_ERROR_BAD_CHAR_REF; |
} |
for (i = 0; i < n; i++) { |
if (!poolAppendChar(pool, buf[i])) |
return XML_ERROR_NO_MEMORY; |
} |
} |
break; |
case XML_TOK_DATA_CHARS: |
if (!poolAppend(pool, enc, ptr, next)) |
return XML_ERROR_NO_MEMORY; |
break; |
case XML_TOK_TRAILING_CR: |
next = ptr + enc->minBytesPerChar; |
/* fall through */ |
case XML_TOK_ATTRIBUTE_VALUE_S: |
case XML_TOK_DATA_NEWLINE: |
if (!isCdata && (poolLength(pool) == 0 || poolLastChar(pool) == 0x20)) |
break; |
if (!poolAppendChar(pool, 0x20)) |
return XML_ERROR_NO_MEMORY; |
break; |
case XML_TOK_ENTITY_REF: |
{ |
const XML_Char *name; |
ENTITY *entity; |
char checkEntityDecl; |
XML_Char ch = (XML_Char) XmlPredefinedEntityName(enc, |
ptr + enc->minBytesPerChar, |
next - enc->minBytesPerChar); |
if (ch) { |
if (!poolAppendChar(pool, ch)) |
return XML_ERROR_NO_MEMORY; |
break; |
} |
name = poolStoreString(&temp2Pool, enc, |
ptr + enc->minBytesPerChar, |
next - enc->minBytesPerChar); |
if (!name) |
return XML_ERROR_NO_MEMORY; |
entity = (ENTITY *)lookup(parser, &dtd->generalEntities, name, 0); |
poolDiscard(&temp2Pool); |
/* First, determine if a check for an existing declaration is needed; |
if yes, check that the entity exists, and that it is internal. |
*/ |
if (pool == &dtd->pool) /* are we called from prolog? */ |
checkEntityDecl = |
#ifdef XML_DTD |
prologState.documentEntity && |
#endif /* XML_DTD */ |
(dtd->standalone |
? !openInternalEntities |
: !dtd->hasParamEntityRefs); |
else /* if (pool == &tempPool): we are called from content */ |
checkEntityDecl = !dtd->hasParamEntityRefs || dtd->standalone; |
if (checkEntityDecl) { |
if (!entity) |
return XML_ERROR_UNDEFINED_ENTITY; |
else if (!entity->is_internal) |
return XML_ERROR_ENTITY_DECLARED_IN_PE; |
} |
else if (!entity) { |
/* Cannot report skipped entity here - see comments on |
skippedEntityHandler. |
if (skippedEntityHandler) |
skippedEntityHandler(handlerArg, name, 0); |
*/ |
/* Cannot call the default handler because this would be |
out of sync with the call to the startElementHandler. |
if ((pool == &tempPool) && defaultHandler) |
reportDefault(parser, enc, ptr, next); |
*/ |
break; |
} |
if (entity->open) { |
if (enc == encoding) |
eventPtr = ptr; |
return XML_ERROR_RECURSIVE_ENTITY_REF; |
} |
if (entity->notation) { |
if (enc == encoding) |
eventPtr = ptr; |
return XML_ERROR_BINARY_ENTITY_REF; |
} |
if (!entity->textPtr) { |
if (enc == encoding) |
eventPtr = ptr; |
return XML_ERROR_ATTRIBUTE_EXTERNAL_ENTITY_REF; |
} |
else { |
enum XML_Error result; |
const XML_Char *textEnd = entity->textPtr + entity->textLen; |
entity->open = XML_TRUE; |
result = appendAttributeValue(parser, internalEncoding, isCdata, |
(char *)entity->textPtr, |
(char *)textEnd, pool); |
entity->open = XML_FALSE; |
if (result) |
return result; |
} |
} |
break; |
default: |
if (enc == encoding) |
eventPtr = ptr; |
return XML_ERROR_UNEXPECTED_STATE; |
} |
ptr = next; |
} |
/* not reached */ |
} |
static enum XML_Error |
storeEntityValue(XML_Parser parser, |
const ENCODING *enc, |
const char *entityTextPtr, |
const char *entityTextEnd) |
{ |
DTD * const dtd = _dtd; /* save one level of indirection */ |
STRING_POOL *pool = &(dtd->entityValuePool); |
enum XML_Error result = XML_ERROR_NONE; |
#ifdef XML_DTD |
int oldInEntityValue = prologState.inEntityValue; |
prologState.inEntityValue = 1; |
#endif /* XML_DTD */ |
/* never return Null for the value argument in EntityDeclHandler, |
since this would indicate an external entity; therefore we |
have to make sure that entityValuePool.start is not null */ |
if (!pool->blocks) { |
if (!poolGrow(pool)) |
return XML_ERROR_NO_MEMORY; |
} |
for (;;) { |
const char *next; |
int tok = XmlEntityValueTok(enc, entityTextPtr, entityTextEnd, &next); |
switch (tok) { |
case XML_TOK_PARAM_ENTITY_REF: |
#ifdef XML_DTD |
if (isParamEntity || enc != encoding) { |
const XML_Char *name; |
ENTITY *entity; |
name = poolStoreString(&tempPool, enc, |
entityTextPtr + enc->minBytesPerChar, |
next - enc->minBytesPerChar); |
if (!name) { |
result = XML_ERROR_NO_MEMORY; |
goto endEntityValue; |
} |
entity = (ENTITY *)lookup(parser, &dtd->paramEntities, name, 0); |
poolDiscard(&tempPool); |
if (!entity) { |
/* not a well-formedness error - see XML 1.0: WFC Entity Declared */ |
/* cannot report skipped entity here - see comments on |
skippedEntityHandler |
if (skippedEntityHandler) |
skippedEntityHandler(handlerArg, name, 0); |
*/ |
dtd->keepProcessing = dtd->standalone; |
goto endEntityValue; |
} |
if (entity->open) { |
if (enc == encoding) |
eventPtr = entityTextPtr; |
result = XML_ERROR_RECURSIVE_ENTITY_REF; |
goto endEntityValue; |
} |
if (entity->systemId) { |
if (externalEntityRefHandler) { |
dtd->paramEntityRead = XML_FALSE; |
entity->open = XML_TRUE; |
if (!externalEntityRefHandler(externalEntityRefHandlerArg, |
0, |
entity->base, |
entity->systemId, |
entity->publicId)) { |
entity->open = XML_FALSE; |
result = XML_ERROR_EXTERNAL_ENTITY_HANDLING; |
goto endEntityValue; |
} |
entity->open = XML_FALSE; |
if (!dtd->paramEntityRead) |
dtd->keepProcessing = dtd->standalone; |
} |
else |
dtd->keepProcessing = dtd->standalone; |
} |
else { |
entity->open = XML_TRUE; |
result = storeEntityValue(parser, |
internalEncoding, |
(char *)entity->textPtr, |
(char *)(entity->textPtr |
+ entity->textLen)); |
entity->open = XML_FALSE; |
if (result) |
goto endEntityValue; |
} |
break; |
} |
#endif /* XML_DTD */ |
/* In the internal subset, PE references are not legal |
within markup declarations, e.g entity values in this case. */ |
eventPtr = entityTextPtr; |
result = XML_ERROR_PARAM_ENTITY_REF; |
goto endEntityValue; |
case XML_TOK_NONE: |
result = XML_ERROR_NONE; |
goto endEntityValue; |
case XML_TOK_ENTITY_REF: |
case XML_TOK_DATA_CHARS: |
if (!poolAppend(pool, enc, entityTextPtr, next)) { |
result = XML_ERROR_NO_MEMORY; |
goto endEntityValue; |
} |
break; |
case XML_TOK_TRAILING_CR: |
next = entityTextPtr + enc->minBytesPerChar; |
/* fall through */ |
case XML_TOK_DATA_NEWLINE: |
if (pool->end == pool->ptr && !poolGrow(pool)) { |
result = XML_ERROR_NO_MEMORY; |
goto endEntityValue; |
} |
*(pool->ptr)++ = 0xA; |
break; |
case XML_TOK_CHAR_REF: |
{ |
XML_Char buf[XML_ENCODE_MAX]; |
int i; |
int n = XmlCharRefNumber(enc, entityTextPtr); |
if (n < 0) { |
if (enc == encoding) |
eventPtr = entityTextPtr; |
result = XML_ERROR_BAD_CHAR_REF; |
goto endEntityValue; |
} |
n = XmlEncode(n, (ICHAR *)buf); |
if (!n) { |
if (enc == encoding) |
eventPtr = entityTextPtr; |
result = XML_ERROR_BAD_CHAR_REF; |
goto endEntityValue; |
} |
for (i = 0; i < n; i++) { |
if (pool->end == pool->ptr && !poolGrow(pool)) { |
result = XML_ERROR_NO_MEMORY; |
goto endEntityValue; |
} |
*(pool->ptr)++ = buf[i]; |
} |
} |
break; |
case XML_TOK_PARTIAL: |
if (enc == encoding) |
eventPtr = entityTextPtr; |
result = XML_ERROR_INVALID_TOKEN; |
goto endEntityValue; |
case XML_TOK_INVALID: |
if (enc == encoding) |
eventPtr = next; |
result = XML_ERROR_INVALID_TOKEN; |
goto endEntityValue; |
default: |
if (enc == encoding) |
eventPtr = entityTextPtr; |
result = XML_ERROR_UNEXPECTED_STATE; |
goto endEntityValue; |
} |
entityTextPtr = next; |
} |
endEntityValue: |
#ifdef XML_DTD |
prologState.inEntityValue = oldInEntityValue; |
#endif /* XML_DTD */ |
return result; |
} |
static void FASTCALL |
normalizeLines(XML_Char *s) |
{ |
XML_Char *p; |
for (;; s++) { |
if (*s == XML_T('\0')) |
return; |
if (*s == 0xD) |
break; |
} |
p = s; |
do { |
if (*s == 0xD) { |
*p++ = 0xA; |
if (*++s == 0xA) |
s++; |
} |
else |
*p++ = *s++; |
} while (*s); |
*p = XML_T('\0'); |
} |
static int |
reportProcessingInstruction(XML_Parser parser, const ENCODING *enc, |
const char *start, const char *end) |
{ |
const XML_Char *target; |
XML_Char *data; |
const char *tem; |
if (!processingInstructionHandler) { |
if (defaultHandler) |
reportDefault(parser, enc, start, end); |
return 1; |
} |
start += enc->minBytesPerChar * 2; |
tem = start + XmlNameLength(enc, start); |
target = poolStoreString(&tempPool, enc, start, tem); |
if (!target) |
return 0; |
poolFinish(&tempPool); |
data = poolStoreString(&tempPool, enc, |
XmlSkipS(enc, tem), |
end - enc->minBytesPerChar*2); |
if (!data) |
return 0; |
normalizeLines(data); |
processingInstructionHandler(handlerArg, target, data); |
poolClear(&tempPool); |
return 1; |
} |
static int |
reportComment(XML_Parser parser, const ENCODING *enc, |
const char *start, const char *end) |
{ |
XML_Char *data; |
if (!commentHandler) { |
if (defaultHandler) |
reportDefault(parser, enc, start, end); |
return 1; |
} |
data = poolStoreString(&tempPool, |
enc, |
start + enc->minBytesPerChar * 4, |
end - enc->minBytesPerChar * 3); |
if (!data) |
return 0; |
normalizeLines(data); |
commentHandler(handlerArg, data); |
poolClear(&tempPool); |
return 1; |
} |
static void |
reportDefault(XML_Parser parser, const ENCODING *enc, |
const char *s, const char *end) |
{ |
if (MUST_CONVERT(enc, s)) { |
const char **eventPP; |
const char **eventEndPP; |
if (enc == encoding) { |
eventPP = &eventPtr; |
eventEndPP = &eventEndPtr; |
} |
else { |
eventPP = &(openInternalEntities->internalEventPtr); |
eventEndPP = &(openInternalEntities->internalEventEndPtr); |
} |
do { |
ICHAR *dataPtr = (ICHAR *)dataBuf; |
XmlConvert(enc, &s, end, &dataPtr, (ICHAR *)dataBufEnd); |
*eventEndPP = s; |
defaultHandler(handlerArg, dataBuf, (int)(dataPtr - (ICHAR *)dataBuf)); |
*eventPP = s; |
} while (s != end); |
} |
else |
defaultHandler(handlerArg, (XML_Char *)s, (int)((XML_Char *)end - (XML_Char *)s)); |
} |
static int |
defineAttribute(ELEMENT_TYPE *type, ATTRIBUTE_ID *attId, XML_Bool isCdata, |
XML_Bool isId, const XML_Char *value, XML_Parser parser) |
{ |
DEFAULT_ATTRIBUTE *att; |
if (value || isId) { |
/* The handling of default attributes gets messed up if we have |
a default which duplicates a non-default. */ |
int i; |
for (i = 0; i < type->nDefaultAtts; i++) |
if (attId == type->defaultAtts[i].id) |
return 1; |
if (isId && !type->idAtt && !attId->xmlns) |
type->idAtt = attId; |
} |
if (type->nDefaultAtts == type->allocDefaultAtts) { |
if (type->allocDefaultAtts == 0) { |
type->allocDefaultAtts = 8; |
type->defaultAtts = (DEFAULT_ATTRIBUTE *)MALLOC(type->allocDefaultAtts |
* sizeof(DEFAULT_ATTRIBUTE)); |
if (!type->defaultAtts) |
return 0; |
} |
else { |
DEFAULT_ATTRIBUTE *temp; |
int count = type->allocDefaultAtts * 2; |
temp = (DEFAULT_ATTRIBUTE *) |
REALLOC(type->defaultAtts, (count * sizeof(DEFAULT_ATTRIBUTE))); |
if (temp == NULL) |
return 0; |
type->allocDefaultAtts = count; |
type->defaultAtts = temp; |
} |
} |
att = type->defaultAtts + type->nDefaultAtts; |
att->id = attId; |
att->value = value; |
att->isCdata = isCdata; |
if (!isCdata) |
attId->maybeTokenized = XML_TRUE; |
type->nDefaultAtts += 1; |
return 1; |
} |
static int |
setElementTypePrefix(XML_Parser parser, ELEMENT_TYPE *elementType) |
{ |
DTD * const dtd = _dtd; /* save one level of indirection */ |
const XML_Char *name; |
for (name = elementType->name; *name; name++) { |
if (*name == XML_T(ASCII_COLON)) { |
PREFIX *prefix; |
const XML_Char *s; |
for (s = elementType->name; s != name; s++) { |
if (!poolAppendChar(&dtd->pool, *s)) |
return 0; |
} |
if (!poolAppendChar(&dtd->pool, XML_T('\0'))) |
return 0; |
prefix = (PREFIX *)lookup(parser, &dtd->prefixes, poolStart(&dtd->pool), |
sizeof(PREFIX)); |
if (!prefix) |
return 0; |
if (prefix->name == poolStart(&dtd->pool)) |
poolFinish(&dtd->pool); |
else |
poolDiscard(&dtd->pool); |
elementType->prefix = prefix; |
} |
} |
return 1; |
} |
static ATTRIBUTE_ID * |
getAttributeId(XML_Parser parser, const ENCODING *enc, |
const char *start, const char *end) |
{ |
DTD * const dtd = _dtd; /* save one level of indirection */ |
ATTRIBUTE_ID *id; |
const XML_Char *name; |
if (!poolAppendChar(&dtd->pool, XML_T('\0'))) |
return NULL; |
name = poolStoreString(&dtd->pool, enc, start, end); |
if (!name) |
return NULL; |
/* skip quotation mark - its storage will be re-used (like in name[-1]) */ |
++name; |
id = (ATTRIBUTE_ID *)lookup(parser, &dtd->attributeIds, name, sizeof(ATTRIBUTE_ID)); |
if (!id) |
return NULL; |
if (id->name != name) |
poolDiscard(&dtd->pool); |
else { |
poolFinish(&dtd->pool); |
if (!ns) |
; |
else if (name[0] == XML_T(ASCII_x) |
&& name[1] == XML_T(ASCII_m) |
&& name[2] == XML_T(ASCII_l) |
&& name[3] == XML_T(ASCII_n) |
&& name[4] == XML_T(ASCII_s) |
&& (name[5] == XML_T('\0') || name[5] == XML_T(ASCII_COLON))) { |
if (name[5] == XML_T('\0')) |
id->prefix = &dtd->defaultPrefix; |
else |
id->prefix = (PREFIX *)lookup(parser, &dtd->prefixes, name + 6, sizeof(PREFIX)); |
id->xmlns = XML_TRUE; |
} |
else { |
int i; |
for (i = 0; name[i]; i++) { |
/* attributes without prefix are *not* in the default namespace */ |
if (name[i] == XML_T(ASCII_COLON)) { |
int j; |
for (j = 0; j < i; j++) { |
if (!poolAppendChar(&dtd->pool, name[j])) |
return NULL; |
} |
if (!poolAppendChar(&dtd->pool, XML_T('\0'))) |
return NULL; |
id->prefix = (PREFIX *)lookup(parser, &dtd->prefixes, poolStart(&dtd->pool), |
sizeof(PREFIX)); |
if (id->prefix->name == poolStart(&dtd->pool)) |
poolFinish(&dtd->pool); |
else |
poolDiscard(&dtd->pool); |
break; |
} |
} |
} |
} |
return id; |
} |
#define CONTEXT_SEP XML_T(ASCII_FF) |
static const XML_Char * |
getContext(XML_Parser parser) |
{ |
DTD * const dtd = _dtd; /* save one level of indirection */ |
HASH_TABLE_ITER iter; |
XML_Bool needSep = XML_FALSE; |
if (dtd->defaultPrefix.binding) { |
int i; |
int len; |
if (!poolAppendChar(&tempPool, XML_T(ASCII_EQUALS))) |
return NULL; |
len = dtd->defaultPrefix.binding->uriLen; |
if (namespaceSeparator) |
len--; |
for (i = 0; i < len; i++) |
if (!poolAppendChar(&tempPool, dtd->defaultPrefix.binding->uri[i])) |
return NULL; |
needSep = XML_TRUE; |
} |
hashTableIterInit(&iter, &(dtd->prefixes)); |
for (;;) { |
int i; |
int len; |
const XML_Char *s; |
PREFIX *prefix = (PREFIX *)hashTableIterNext(&iter); |
if (!prefix) |
break; |
if (!prefix->binding) |
continue; |
if (needSep && !poolAppendChar(&tempPool, CONTEXT_SEP)) |
return NULL; |
for (s = prefix->name; *s; s++) |
if (!poolAppendChar(&tempPool, *s)) |
return NULL; |
if (!poolAppendChar(&tempPool, XML_T(ASCII_EQUALS))) |
return NULL; |
len = prefix->binding->uriLen; |
if (namespaceSeparator) |
len--; |
for (i = 0; i < len; i++) |
if (!poolAppendChar(&tempPool, prefix->binding->uri[i])) |
return NULL; |
needSep = XML_TRUE; |
} |
hashTableIterInit(&iter, &(dtd->generalEntities)); |
for (;;) { |
const XML_Char *s; |
ENTITY *e = (ENTITY *)hashTableIterNext(&iter); |
if (!e) |
break; |
if (!e->open) |
continue; |
if (needSep && !poolAppendChar(&tempPool, CONTEXT_SEP)) |
return NULL; |
for (s = e->name; *s; s++) |
if (!poolAppendChar(&tempPool, *s)) |
return 0; |
needSep = XML_TRUE; |
} |
if (!poolAppendChar(&tempPool, XML_T('\0'))) |
return NULL; |
return tempPool.start; |
} |
static XML_Bool |
setContext(XML_Parser parser, const XML_Char *context) |
{ |
DTD * const dtd = _dtd; /* save one level of indirection */ |
const XML_Char *s = context; |
while (*context != XML_T('\0')) { |
if (*s == CONTEXT_SEP || *s == XML_T('\0')) { |
ENTITY *e; |
if (!poolAppendChar(&tempPool, XML_T('\0'))) |
return XML_FALSE; |
e = (ENTITY *)lookup(parser, &dtd->generalEntities, poolStart(&tempPool), 0); |
if (e) |
e->open = XML_TRUE; |
if (*s != XML_T('\0')) |
s++; |
context = s; |
poolDiscard(&tempPool); |
} |
else if (*s == XML_T(ASCII_EQUALS)) { |
PREFIX *prefix; |
if (poolLength(&tempPool) == 0) |
prefix = &dtd->defaultPrefix; |
else { |
if (!poolAppendChar(&tempPool, XML_T('\0'))) |
return XML_FALSE; |
prefix = (PREFIX *)lookup(parser, &dtd->prefixes, poolStart(&tempPool), |
sizeof(PREFIX)); |
if (!prefix) |
return XML_FALSE; |
if (prefix->name == poolStart(&tempPool)) { |
prefix->name = poolCopyString(&dtd->pool, prefix->name); |
if (!prefix->name) |
return XML_FALSE; |
} |
poolDiscard(&tempPool); |
} |
for (context = s + 1; |
*context != CONTEXT_SEP && *context != XML_T('\0'); |
context++) |
if (!poolAppendChar(&tempPool, *context)) |
return XML_FALSE; |
if (!poolAppendChar(&tempPool, XML_T('\0'))) |
return XML_FALSE; |
if (addBinding(parser, prefix, NULL, poolStart(&tempPool), |
&inheritedBindings) != XML_ERROR_NONE) |
return XML_FALSE; |
poolDiscard(&tempPool); |
if (*context != XML_T('\0')) |
++context; |
s = context; |
} |
else { |
if (!poolAppendChar(&tempPool, *s)) |
return XML_FALSE; |
s++; |
} |
} |
return XML_TRUE; |
} |
static void FASTCALL |
normalizePublicId(XML_Char *publicId) |
{ |
XML_Char *p = publicId; |
XML_Char *s; |
for (s = publicId; *s; s++) { |
switch (*s) { |
case 0x20: |
case 0xD: |
case 0xA: |
if (p != publicId && p[-1] != 0x20) |
*p++ = 0x20; |
break; |
default: |
*p++ = *s; |
} |
} |
if (p != publicId && p[-1] == 0x20) |
--p; |
*p = XML_T('\0'); |
} |
static DTD * |
dtdCreate(const XML_Memory_Handling_Suite *ms) |
{ |
DTD *p = (DTD *)ms->malloc_fcn(sizeof(DTD)); |
if (p == NULL) |
return p; |
poolInit(&(p->pool), ms); |
poolInit(&(p->entityValuePool), ms); |
hashTableInit(&(p->generalEntities), ms); |
hashTableInit(&(p->elementTypes), ms); |
hashTableInit(&(p->attributeIds), ms); |
hashTableInit(&(p->prefixes), ms); |
#ifdef XML_DTD |
p->paramEntityRead = XML_FALSE; |
hashTableInit(&(p->paramEntities), ms); |
#endif /* XML_DTD */ |
p->defaultPrefix.name = NULL; |
p->defaultPrefix.binding = NULL; |
p->in_eldecl = XML_FALSE; |
p->scaffIndex = NULL; |
p->scaffold = NULL; |
p->scaffLevel = 0; |
p->scaffSize = 0; |
p->scaffCount = 0; |
p->contentStringLen = 0; |
p->keepProcessing = XML_TRUE; |
p->hasParamEntityRefs = XML_FALSE; |
p->standalone = XML_FALSE; |
return p; |
} |
static void |
dtdReset(DTD *p, const XML_Memory_Handling_Suite *ms) |
{ |
HASH_TABLE_ITER iter; |
hashTableIterInit(&iter, &(p->elementTypes)); |
for (;;) { |
ELEMENT_TYPE *e = (ELEMENT_TYPE *)hashTableIterNext(&iter); |
if (!e) |
break; |
if (e->allocDefaultAtts != 0) |
ms->free_fcn(e->defaultAtts); |
} |
hashTableClear(&(p->generalEntities)); |
#ifdef XML_DTD |
p->paramEntityRead = XML_FALSE; |
hashTableClear(&(p->paramEntities)); |
#endif /* XML_DTD */ |
hashTableClear(&(p->elementTypes)); |
hashTableClear(&(p->attributeIds)); |
hashTableClear(&(p->prefixes)); |
poolClear(&(p->pool)); |
poolClear(&(p->entityValuePool)); |
p->defaultPrefix.name = NULL; |
p->defaultPrefix.binding = NULL; |
p->in_eldecl = XML_FALSE; |
ms->free_fcn(p->scaffIndex); |
p->scaffIndex = NULL; |
ms->free_fcn(p->scaffold); |
p->scaffold = NULL; |
p->scaffLevel = 0; |
p->scaffSize = 0; |
p->scaffCount = 0; |
p->contentStringLen = 0; |
p->keepProcessing = XML_TRUE; |
p->hasParamEntityRefs = XML_FALSE; |
p->standalone = XML_FALSE; |
} |
static void |
dtdDestroy(DTD *p, XML_Bool isDocEntity, const XML_Memory_Handling_Suite *ms) |
{ |
HASH_TABLE_ITER iter; |
hashTableIterInit(&iter, &(p->elementTypes)); |
for (;;) { |
ELEMENT_TYPE *e = (ELEMENT_TYPE *)hashTableIterNext(&iter); |
if (!e) |
break; |
if (e->allocDefaultAtts != 0) |
ms->free_fcn(e->defaultAtts); |
} |
hashTableDestroy(&(p->generalEntities)); |
#ifdef XML_DTD |
hashTableDestroy(&(p->paramEntities)); |
#endif /* XML_DTD */ |
hashTableDestroy(&(p->elementTypes)); |
hashTableDestroy(&(p->attributeIds)); |
hashTableDestroy(&(p->prefixes)); |
poolDestroy(&(p->pool)); |
poolDestroy(&(p->entityValuePool)); |
if (isDocEntity) { |
ms->free_fcn(p->scaffIndex); |
ms->free_fcn(p->scaffold); |
} |
ms->free_fcn(p); |
} |
/* Do a deep copy of the DTD. Return 0 for out of memory, non-zero otherwise. |
The new DTD has already been initialized. |
*/ |
static int |
dtdCopy(XML_Parser oldParser, DTD *newDtd, const DTD *oldDtd, const XML_Memory_Handling_Suite *ms) |
{ |
HASH_TABLE_ITER iter; |
/* Copy the prefix table. */ |
hashTableIterInit(&iter, &(oldDtd->prefixes)); |
for (;;) { |
const XML_Char *name; |
const PREFIX *oldP = (PREFIX *)hashTableIterNext(&iter); |
if (!oldP) |
break; |
name = poolCopyString(&(newDtd->pool), oldP->name); |
if (!name) |
return 0; |
if (!lookup(oldParser, &(newDtd->prefixes), name, sizeof(PREFIX))) |
return 0; |
} |
hashTableIterInit(&iter, &(oldDtd->attributeIds)); |
/* Copy the attribute id table. */ |
for (;;) { |
ATTRIBUTE_ID *newA; |
const XML_Char *name; |
const ATTRIBUTE_ID *oldA = (ATTRIBUTE_ID *)hashTableIterNext(&iter); |
if (!oldA) |
break; |
/* Remember to allocate the scratch byte before the name. */ |
if (!poolAppendChar(&(newDtd->pool), XML_T('\0'))) |
return 0; |
name = poolCopyString(&(newDtd->pool), oldA->name); |
if (!name) |
return 0; |
++name; |
newA = (ATTRIBUTE_ID *)lookup(oldParser, &(newDtd->attributeIds), name, |
sizeof(ATTRIBUTE_ID)); |
if (!newA) |
return 0; |
newA->maybeTokenized = oldA->maybeTokenized; |
if (oldA->prefix) { |
newA->xmlns = oldA->xmlns; |
if (oldA->prefix == &oldDtd->defaultPrefix) |
newA->prefix = &newDtd->defaultPrefix; |
else |
newA->prefix = (PREFIX *)lookup(oldParser, &(newDtd->prefixes), |
oldA->prefix->name, 0); |
} |
} |
/* Copy the element type table. */ |
hashTableIterInit(&iter, &(oldDtd->elementTypes)); |
for (;;) { |
int i; |
ELEMENT_TYPE *newE; |
const XML_Char *name; |
const ELEMENT_TYPE *oldE = (ELEMENT_TYPE *)hashTableIterNext(&iter); |
if (!oldE) |
break; |
name = poolCopyString(&(newDtd->pool), oldE->name); |
if (!name) |
return 0; |
newE = (ELEMENT_TYPE *)lookup(oldParser, &(newDtd->elementTypes), name, |
sizeof(ELEMENT_TYPE)); |
if (!newE) |
return 0; |
if (oldE->nDefaultAtts) { |
newE->defaultAtts = (DEFAULT_ATTRIBUTE *) |
ms->malloc_fcn(oldE->nDefaultAtts * sizeof(DEFAULT_ATTRIBUTE)); |
if (!newE->defaultAtts) { |
ms->free_fcn(newE); |
return 0; |
} |
} |
if (oldE->idAtt) |
newE->idAtt = (ATTRIBUTE_ID *) |
lookup(oldParser, &(newDtd->attributeIds), oldE->idAtt->name, 0); |
newE->allocDefaultAtts = newE->nDefaultAtts = oldE->nDefaultAtts; |
if (oldE->prefix) |
newE->prefix = (PREFIX *)lookup(oldParser, &(newDtd->prefixes), |
oldE->prefix->name, 0); |
for (i = 0; i < newE->nDefaultAtts; i++) { |
newE->defaultAtts[i].id = (ATTRIBUTE_ID *) |
lookup(oldParser, &(newDtd->attributeIds), oldE->defaultAtts[i].id->name, 0); |
newE->defaultAtts[i].isCdata = oldE->defaultAtts[i].isCdata; |
if (oldE->defaultAtts[i].value) { |
newE->defaultAtts[i].value |
= poolCopyString(&(newDtd->pool), oldE->defaultAtts[i].value); |
if (!newE->defaultAtts[i].value) |
return 0; |
} |
else |
newE->defaultAtts[i].value = NULL; |
} |
} |
/* Copy the entity tables. */ |
if (!copyEntityTable(oldParser, |
&(newDtd->generalEntities), |
&(newDtd->pool), |
&(oldDtd->generalEntities))) |
return 0; |
#ifdef XML_DTD |
if (!copyEntityTable(oldParser, |
&(newDtd->paramEntities), |
&(newDtd->pool), |
&(oldDtd->paramEntities))) |
return 0; |
newDtd->paramEntityRead = oldDtd->paramEntityRead; |
#endif /* XML_DTD */ |
newDtd->keepProcessing = oldDtd->keepProcessing; |
newDtd->hasParamEntityRefs = oldDtd->hasParamEntityRefs; |
newDtd->standalone = oldDtd->standalone; |
/* Don't want deep copying for scaffolding */ |
newDtd->in_eldecl = oldDtd->in_eldecl; |
newDtd->scaffold = oldDtd->scaffold; |
newDtd->contentStringLen = oldDtd->contentStringLen; |
newDtd->scaffSize = oldDtd->scaffSize; |
newDtd->scaffLevel = oldDtd->scaffLevel; |
newDtd->scaffIndex = oldDtd->scaffIndex; |
return 1; |
} /* End dtdCopy */ |
static int |
copyEntityTable(XML_Parser oldParser, |
HASH_TABLE *newTable, |
STRING_POOL *newPool, |
const HASH_TABLE *oldTable) |
{ |
HASH_TABLE_ITER iter; |
const XML_Char *cachedOldBase = NULL; |
const XML_Char *cachedNewBase = NULL; |
hashTableIterInit(&iter, oldTable); |
for (;;) { |
ENTITY *newE; |
const XML_Char *name; |
const ENTITY *oldE = (ENTITY *)hashTableIterNext(&iter); |
if (!oldE) |
break; |
name = poolCopyString(newPool, oldE->name); |
if (!name) |
return 0; |
newE = (ENTITY *)lookup(oldParser, newTable, name, sizeof(ENTITY)); |
if (!newE) |
return 0; |
if (oldE->systemId) { |
const XML_Char *tem = poolCopyString(newPool, oldE->systemId); |
if (!tem) |
return 0; |
newE->systemId = tem; |
if (oldE->base) { |
if (oldE->base == cachedOldBase) |
newE->base = cachedNewBase; |
else { |
cachedOldBase = oldE->base; |
tem = poolCopyString(newPool, cachedOldBase); |
if (!tem) |
return 0; |
cachedNewBase = newE->base = tem; |
} |
} |
if (oldE->publicId) { |
tem = poolCopyString(newPool, oldE->publicId); |
if (!tem) |
return 0; |
newE->publicId = tem; |
} |
} |
else { |
const XML_Char *tem = poolCopyStringN(newPool, oldE->textPtr, |
oldE->textLen); |
if (!tem) |
return 0; |
newE->textPtr = tem; |
newE->textLen = oldE->textLen; |
} |
if (oldE->notation) { |
const XML_Char *tem = poolCopyString(newPool, oldE->notation); |
if (!tem) |
return 0; |
newE->notation = tem; |
} |
newE->is_param = oldE->is_param; |
newE->is_internal = oldE->is_internal; |
} |
return 1; |
} |
#define INIT_POWER 6 |
static XML_Bool FASTCALL |
keyeq(KEY s1, KEY s2) |
{ |
for (; *s1 == *s2; s1++, s2++) |
if (*s1 == 0) |
return XML_TRUE; |
return XML_FALSE; |
} |
static unsigned long FASTCALL |
hash(XML_Parser parser, KEY s) |
{ |
unsigned long h = hash_secret_salt; |
while (*s) |
h = CHAR_HASH(h, *s++); |
return h; |
} |
static NAMED * |
lookup(XML_Parser parser, HASH_TABLE *table, KEY name, size_t createSize) |
{ |
size_t i; |
if (table->size == 0) { |
size_t tsize; |
if (!createSize) |
return NULL; |
table->power = INIT_POWER; |
/* table->size is a power of 2 */ |
table->size = (size_t)1 << INIT_POWER; |
tsize = table->size * sizeof(NAMED *); |
table->v = (NAMED **)table->mem->malloc_fcn(tsize); |
if (!table->v) { |
table->size = 0; |
return NULL; |
} |
memset(table->v, 0, tsize); |
i = hash(parser, name) & ((unsigned long)table->size - 1); |
} |
else { |
unsigned long h = hash(parser, name); |
unsigned long mask = (unsigned long)table->size - 1; |
unsigned char step = 0; |
i = h & mask; |
while (table->v[i]) { |
if (keyeq(name, table->v[i]->name)) |
return table->v[i]; |
if (!step) |
step = PROBE_STEP(h, mask, table->power); |
i < step ? (i += table->size - step) : (i -= step); |
} |
if (!createSize) |
return NULL; |
/* check for overflow (table is half full) */ |
if (table->used >> (table->power - 1)) { |
unsigned char newPower = table->power + 1; |
size_t newSize = (size_t)1 << newPower; |
unsigned long newMask = (unsigned long)newSize - 1; |
size_t tsize = newSize * sizeof(NAMED *); |
NAMED **newV = (NAMED **)table->mem->malloc_fcn(tsize); |
if (!newV) |
return NULL; |
memset(newV, 0, tsize); |
for (i = 0; i < table->size; i++) |
if (table->v[i]) { |
unsigned long newHash = hash(parser, table->v[i]->name); |
size_t j = newHash & newMask; |
step = 0; |
while (newV[j]) { |
if (!step) |
step = PROBE_STEP(newHash, newMask, newPower); |
j < step ? (j += newSize - step) : (j -= step); |
} |
newV[j] = table->v[i]; |
} |
table->mem->free_fcn(table->v); |
table->v = newV; |
table->power = newPower; |
table->size = newSize; |
i = h & newMask; |
step = 0; |
while (table->v[i]) { |
if (!step) |
step = PROBE_STEP(h, newMask, newPower); |
i < step ? (i += newSize - step) : (i -= step); |
} |
} |
} |
table->v[i] = (NAMED *)table->mem->malloc_fcn(createSize); |
if (!table->v[i]) |
return NULL; |
memset(table->v[i], 0, createSize); |
table->v[i]->name = name; |
(table->used)++; |
return table->v[i]; |
} |
static void FASTCALL |
hashTableClear(HASH_TABLE *table) |
{ |
size_t i; |
for (i = 0; i < table->size; i++) { |
table->mem->free_fcn(table->v[i]); |
table->v[i] = NULL; |
} |
table->used = 0; |
} |
static void FASTCALL |
hashTableDestroy(HASH_TABLE *table) |
{ |
size_t i; |
for (i = 0; i < table->size; i++) |
table->mem->free_fcn(table->v[i]); |
table->mem->free_fcn(table->v); |
} |
static void FASTCALL |
hashTableInit(HASH_TABLE *p, const XML_Memory_Handling_Suite *ms) |
{ |
p->power = 0; |
p->size = 0; |
p->used = 0; |
p->v = NULL; |
p->mem = ms; |
} |
static void FASTCALL |
hashTableIterInit(HASH_TABLE_ITER *iter, const HASH_TABLE *table) |
{ |
iter->p = table->v; |
iter->end = iter->p + table->size; |
} |
static NAMED * FASTCALL |
hashTableIterNext(HASH_TABLE_ITER *iter) |
{ |
while (iter->p != iter->end) { |
NAMED *tem = *(iter->p)++; |
if (tem) |
return tem; |
} |
return NULL; |
} |
static void FASTCALL |
poolInit(STRING_POOL *pool, const XML_Memory_Handling_Suite *ms) |
{ |
pool->blocks = NULL; |
pool->freeBlocks = NULL; |
pool->start = NULL; |
pool->ptr = NULL; |
pool->end = NULL; |
pool->mem = ms; |
} |
static void FASTCALL |
poolClear(STRING_POOL *pool) |
{ |
if (!pool->freeBlocks) |
pool->freeBlocks = pool->blocks; |
else { |
BLOCK *p = pool->blocks; |
while (p) { |
BLOCK *tem = p->next; |
p->next = pool->freeBlocks; |
pool->freeBlocks = p; |
p = tem; |
} |
} |
pool->blocks = NULL; |
pool->start = NULL; |
pool->ptr = NULL; |
pool->end = NULL; |
} |
static void FASTCALL |
poolDestroy(STRING_POOL *pool) |
{ |
BLOCK *p = pool->blocks; |
while (p) { |
BLOCK *tem = p->next; |
pool->mem->free_fcn(p); |
p = tem; |
} |
p = pool->freeBlocks; |
while (p) { |
BLOCK *tem = p->next; |
pool->mem->free_fcn(p); |
p = tem; |
} |
} |
static XML_Char * |
poolAppend(STRING_POOL *pool, const ENCODING *enc, |
const char *ptr, const char *end) |
{ |
if (!pool->ptr && !poolGrow(pool)) |
return NULL; |
for (;;) { |
XmlConvert(enc, &ptr, end, (ICHAR **)&(pool->ptr), (ICHAR *)pool->end); |
if (ptr == end) |
break; |
if (!poolGrow(pool)) |
return NULL; |
} |
return pool->start; |
} |
static const XML_Char * FASTCALL |
poolCopyString(STRING_POOL *pool, const XML_Char *s) |
{ |
do { |
if (!poolAppendChar(pool, *s)) |
return NULL; |
} while (*s++); |
s = pool->start; |
poolFinish(pool); |
return s; |
} |
static const XML_Char * |
poolCopyStringN(STRING_POOL *pool, const XML_Char *s, int n) |
{ |
if (!pool->ptr && !poolGrow(pool)) |
return NULL; |
for (; n > 0; --n, s++) { |
if (!poolAppendChar(pool, *s)) |
return NULL; |
} |
s = pool->start; |
poolFinish(pool); |
return s; |
} |
static const XML_Char * FASTCALL |
poolAppendString(STRING_POOL *pool, const XML_Char *s) |
{ |
while (*s) { |
if (!poolAppendChar(pool, *s)) |
return NULL; |
s++; |
} |
return pool->start; |
} |
static XML_Char * |
poolStoreString(STRING_POOL *pool, const ENCODING *enc, |
const char *ptr, const char *end) |
{ |
if (!poolAppend(pool, enc, ptr, end)) |
return NULL; |
if (pool->ptr == pool->end && !poolGrow(pool)) |
return NULL; |
*(pool->ptr)++ = 0; |
return pool->start; |
} |
static XML_Bool FASTCALL |
poolGrow(STRING_POOL *pool) |
{ |
if (pool->freeBlocks) { |
if (pool->start == 0) { |
pool->blocks = pool->freeBlocks; |
pool->freeBlocks = pool->freeBlocks->next; |
pool->blocks->next = NULL; |
pool->start = pool->blocks->s; |
pool->end = pool->start + pool->blocks->size; |
pool->ptr = pool->start; |
return XML_TRUE; |
} |
if (pool->end - pool->start < pool->freeBlocks->size) { |
BLOCK *tem = pool->freeBlocks->next; |
pool->freeBlocks->next = pool->blocks; |
pool->blocks = pool->freeBlocks; |
pool->freeBlocks = tem; |
memcpy(pool->blocks->s, pool->start, |
(pool->end - pool->start) * sizeof(XML_Char)); |
pool->ptr = pool->blocks->s + (pool->ptr - pool->start); |
pool->start = pool->blocks->s; |
pool->end = pool->start + pool->blocks->size; |
return XML_TRUE; |
} |
} |
if (pool->blocks && pool->start == pool->blocks->s) { |
int blockSize = (int)(pool->end - pool->start)*2; |
BLOCK *temp = (BLOCK *) |
pool->mem->realloc_fcn(pool->blocks, |
(offsetof(BLOCK, s) |
+ blockSize * sizeof(XML_Char))); |
if (temp == NULL) |
return XML_FALSE; |
pool->blocks = temp; |
pool->blocks->size = blockSize; |
pool->ptr = pool->blocks->s + (pool->ptr - pool->start); |
pool->start = pool->blocks->s; |
pool->end = pool->start + blockSize; |
} |
else { |
BLOCK *tem; |
int blockSize = (int)(pool->end - pool->start); |
if (blockSize < INIT_BLOCK_SIZE) |
blockSize = INIT_BLOCK_SIZE; |
else |
blockSize *= 2; |
tem = (BLOCK *)pool->mem->malloc_fcn(offsetof(BLOCK, s) |
+ blockSize * sizeof(XML_Char)); |
if (!tem) |
return XML_FALSE; |
tem->size = blockSize; |
tem->next = pool->blocks; |
pool->blocks = tem; |
if (pool->ptr != pool->start) |
memcpy(tem->s, pool->start, |
(pool->ptr - pool->start) * sizeof(XML_Char)); |
pool->ptr = tem->s + (pool->ptr - pool->start); |
pool->start = tem->s; |
pool->end = tem->s + blockSize; |
} |
return XML_TRUE; |
} |
static int FASTCALL |
nextScaffoldPart(XML_Parser parser) |
{ |
DTD * const dtd = _dtd; /* save one level of indirection */ |
CONTENT_SCAFFOLD * me; |
int next; |
if (!dtd->scaffIndex) { |
dtd->scaffIndex = (int *)MALLOC(groupSize * sizeof(int)); |
if (!dtd->scaffIndex) |
return -1; |
dtd->scaffIndex[0] = 0; |
} |
if (dtd->scaffCount >= dtd->scaffSize) { |
CONTENT_SCAFFOLD *temp; |
if (dtd->scaffold) { |
temp = (CONTENT_SCAFFOLD *) |
REALLOC(dtd->scaffold, dtd->scaffSize * 2 * sizeof(CONTENT_SCAFFOLD)); |
if (temp == NULL) |
return -1; |
dtd->scaffSize *= 2; |
} |
else { |
temp = (CONTENT_SCAFFOLD *)MALLOC(INIT_SCAFFOLD_ELEMENTS |
* sizeof(CONTENT_SCAFFOLD)); |
if (temp == NULL) |
return -1; |
dtd->scaffSize = INIT_SCAFFOLD_ELEMENTS; |
} |
dtd->scaffold = temp; |
} |
next = dtd->scaffCount++; |
me = &dtd->scaffold[next]; |
if (dtd->scaffLevel) { |
CONTENT_SCAFFOLD *parent = &dtd->scaffold[dtd->scaffIndex[dtd->scaffLevel-1]]; |
if (parent->lastchild) { |
dtd->scaffold[parent->lastchild].nextsib = next; |
} |
if (!parent->childcnt) |
parent->firstchild = next; |
parent->lastchild = next; |
parent->childcnt++; |
} |
me->firstchild = me->lastchild = me->childcnt = me->nextsib = 0; |
return next; |
} |
static void |
build_node(XML_Parser parser, |
int src_node, |
XML_Content *dest, |
XML_Content **contpos, |
XML_Char **strpos) |
{ |
DTD * const dtd = _dtd; /* save one level of indirection */ |
dest->type = dtd->scaffold[src_node].type; |
dest->quant = dtd->scaffold[src_node].quant; |
if (dest->type == XML_CTYPE_NAME) { |
const XML_Char *src; |
dest->name = *strpos; |
src = dtd->scaffold[src_node].name; |
for (;;) { |
*(*strpos)++ = *src; |
if (!*src) |
break; |
src++; |
} |
dest->numchildren = 0; |
dest->children = NULL; |
} |
else { |
unsigned int i; |
int cn; |
dest->numchildren = dtd->scaffold[src_node].childcnt; |
dest->children = *contpos; |
*contpos += dest->numchildren; |
for (i = 0, cn = dtd->scaffold[src_node].firstchild; |
i < dest->numchildren; |
i++, cn = dtd->scaffold[cn].nextsib) { |
build_node(parser, cn, &(dest->children[i]), contpos, strpos); |
} |
dest->name = NULL; |
} |
} |
static XML_Content * |
build_model (XML_Parser parser) |
{ |
DTD * const dtd = _dtd; /* save one level of indirection */ |
XML_Content *ret; |
XML_Content *cpos; |
XML_Char * str; |
int allocsize = (dtd->scaffCount * sizeof(XML_Content) |
+ (dtd->contentStringLen * sizeof(XML_Char))); |
ret = (XML_Content *)MALLOC(allocsize); |
if (!ret) |
return NULL; |
str = (XML_Char *) (&ret[dtd->scaffCount]); |
cpos = &ret[1]; |
build_node(parser, 0, ret, &cpos, &str); |
return ret; |
} |
static ELEMENT_TYPE * |
getElementType(XML_Parser parser, |
const ENCODING *enc, |
const char *ptr, |
const char *end) |
{ |
DTD * const dtd = _dtd; /* save one level of indirection */ |
const XML_Char *name = poolStoreString(&dtd->pool, enc, ptr, end); |
ELEMENT_TYPE *ret; |
if (!name) |
return NULL; |
ret = (ELEMENT_TYPE *) lookup(parser, &dtd->elementTypes, name, sizeof(ELEMENT_TYPE)); |
if (!ret) |
return NULL; |
if (ret->name != name) |
poolDiscard(&dtd->pool); |
else { |
poolFinish(&dtd->pool); |
if (!setElementTypePrefix(parser, ret)) |
return NULL; |
} |
return ret; |
} |
/contrib/sdk/sources/expat/lib/xmlrole.c |
---|
0,0 → 1,1336 |
/* Copyright (c) 1998, 1999 Thai Open Source Software Center Ltd |
See the file COPYING for copying permission. |
*/ |
#include <stddef.h> |
#ifdef COMPILED_FROM_DSP |
#include "winconfig.h" |
#elif defined(MACOS_CLASSIC) |
#include "macconfig.h" |
#elif defined(__amigaos__) |
#include "amigaconfig.h" |
#elif defined(__WATCOMC__) |
#include "watcomconfig.h" |
#else |
#ifdef HAVE_EXPAT_CONFIG_H |
#include <expat_config.h> |
#endif |
#endif /* ndef COMPILED_FROM_DSP */ |
#include "expat_external.h" |
#include "internal.h" |
#include "xmlrole.h" |
#include "ascii.h" |
/* Doesn't check: |
that ,| are not mixed in a model group |
content of literals |
*/ |
static const char KW_ANY[] = { |
ASCII_A, ASCII_N, ASCII_Y, '\0' }; |
static const char KW_ATTLIST[] = { |
ASCII_A, ASCII_T, ASCII_T, ASCII_L, ASCII_I, ASCII_S, ASCII_T, '\0' }; |
static const char KW_CDATA[] = { |
ASCII_C, ASCII_D, ASCII_A, ASCII_T, ASCII_A, '\0' }; |
static const char KW_DOCTYPE[] = { |
ASCII_D, ASCII_O, ASCII_C, ASCII_T, ASCII_Y, ASCII_P, ASCII_E, '\0' }; |
static const char KW_ELEMENT[] = { |
ASCII_E, ASCII_L, ASCII_E, ASCII_M, ASCII_E, ASCII_N, ASCII_T, '\0' }; |
static const char KW_EMPTY[] = { |
ASCII_E, ASCII_M, ASCII_P, ASCII_T, ASCII_Y, '\0' }; |
static const char KW_ENTITIES[] = { |
ASCII_E, ASCII_N, ASCII_T, ASCII_I, ASCII_T, ASCII_I, ASCII_E, ASCII_S, |
'\0' }; |
static const char KW_ENTITY[] = { |
ASCII_E, ASCII_N, ASCII_T, ASCII_I, ASCII_T, ASCII_Y, '\0' }; |
static const char KW_FIXED[] = { |
ASCII_F, ASCII_I, ASCII_X, ASCII_E, ASCII_D, '\0' }; |
static const char KW_ID[] = { |
ASCII_I, ASCII_D, '\0' }; |
static const char KW_IDREF[] = { |
ASCII_I, ASCII_D, ASCII_R, ASCII_E, ASCII_F, '\0' }; |
static const char KW_IDREFS[] = { |
ASCII_I, ASCII_D, ASCII_R, ASCII_E, ASCII_F, ASCII_S, '\0' }; |
#ifdef XML_DTD |
static const char KW_IGNORE[] = { |
ASCII_I, ASCII_G, ASCII_N, ASCII_O, ASCII_R, ASCII_E, '\0' }; |
#endif |
static const char KW_IMPLIED[] = { |
ASCII_I, ASCII_M, ASCII_P, ASCII_L, ASCII_I, ASCII_E, ASCII_D, '\0' }; |
#ifdef XML_DTD |
static const char KW_INCLUDE[] = { |
ASCII_I, ASCII_N, ASCII_C, ASCII_L, ASCII_U, ASCII_D, ASCII_E, '\0' }; |
#endif |
static const char KW_NDATA[] = { |
ASCII_N, ASCII_D, ASCII_A, ASCII_T, ASCII_A, '\0' }; |
static const char KW_NMTOKEN[] = { |
ASCII_N, ASCII_M, ASCII_T, ASCII_O, ASCII_K, ASCII_E, ASCII_N, '\0' }; |
static const char KW_NMTOKENS[] = { |
ASCII_N, ASCII_M, ASCII_T, ASCII_O, ASCII_K, ASCII_E, ASCII_N, ASCII_S, |
'\0' }; |
static const char KW_NOTATION[] = |
{ ASCII_N, ASCII_O, ASCII_T, ASCII_A, ASCII_T, ASCII_I, ASCII_O, ASCII_N, |
'\0' }; |
static const char KW_PCDATA[] = { |
ASCII_P, ASCII_C, ASCII_D, ASCII_A, ASCII_T, ASCII_A, '\0' }; |
static const char KW_PUBLIC[] = { |
ASCII_P, ASCII_U, ASCII_B, ASCII_L, ASCII_I, ASCII_C, '\0' }; |
static const char KW_REQUIRED[] = { |
ASCII_R, ASCII_E, ASCII_Q, ASCII_U, ASCII_I, ASCII_R, ASCII_E, ASCII_D, |
'\0' }; |
static const char KW_SYSTEM[] = { |
ASCII_S, ASCII_Y, ASCII_S, ASCII_T, ASCII_E, ASCII_M, '\0' }; |
#ifndef MIN_BYTES_PER_CHAR |
#define MIN_BYTES_PER_CHAR(enc) ((enc)->minBytesPerChar) |
#endif |
#ifdef XML_DTD |
#define setTopLevel(state) \ |
((state)->handler = ((state)->documentEntity \ |
? internalSubset \ |
: externalSubset1)) |
#else /* not XML_DTD */ |
#define setTopLevel(state) ((state)->handler = internalSubset) |
#endif /* not XML_DTD */ |
typedef int PTRCALL PROLOG_HANDLER(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc); |
static PROLOG_HANDLER |
prolog0, prolog1, prolog2, |
doctype0, doctype1, doctype2, doctype3, doctype4, doctype5, |
internalSubset, |
entity0, entity1, entity2, entity3, entity4, entity5, entity6, |
entity7, entity8, entity9, entity10, |
notation0, notation1, notation2, notation3, notation4, |
attlist0, attlist1, attlist2, attlist3, attlist4, attlist5, attlist6, |
attlist7, attlist8, attlist9, |
element0, element1, element2, element3, element4, element5, element6, |
element7, |
#ifdef XML_DTD |
externalSubset0, externalSubset1, |
condSect0, condSect1, condSect2, |
#endif /* XML_DTD */ |
declClose, |
error; |
static int FASTCALL common(PROLOG_STATE *state, int tok); |
static int PTRCALL |
prolog0(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
state->handler = prolog1; |
return XML_ROLE_NONE; |
case XML_TOK_XML_DECL: |
state->handler = prolog1; |
return XML_ROLE_XML_DECL; |
case XML_TOK_PI: |
state->handler = prolog1; |
return XML_ROLE_PI; |
case XML_TOK_COMMENT: |
state->handler = prolog1; |
return XML_ROLE_COMMENT; |
case XML_TOK_BOM: |
return XML_ROLE_NONE; |
case XML_TOK_DECL_OPEN: |
if (!XmlNameMatchesAscii(enc, |
ptr + 2 * MIN_BYTES_PER_CHAR(enc), |
end, |
KW_DOCTYPE)) |
break; |
state->handler = doctype0; |
return XML_ROLE_DOCTYPE_NONE; |
case XML_TOK_INSTANCE_START: |
state->handler = error; |
return XML_ROLE_INSTANCE_START; |
} |
return common(state, tok); |
} |
static int PTRCALL |
prolog1(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_NONE; |
case XML_TOK_PI: |
return XML_ROLE_PI; |
case XML_TOK_COMMENT: |
return XML_ROLE_COMMENT; |
case XML_TOK_BOM: |
return XML_ROLE_NONE; |
case XML_TOK_DECL_OPEN: |
if (!XmlNameMatchesAscii(enc, |
ptr + 2 * MIN_BYTES_PER_CHAR(enc), |
end, |
KW_DOCTYPE)) |
break; |
state->handler = doctype0; |
return XML_ROLE_DOCTYPE_NONE; |
case XML_TOK_INSTANCE_START: |
state->handler = error; |
return XML_ROLE_INSTANCE_START; |
} |
return common(state, tok); |
} |
static int PTRCALL |
prolog2(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_NONE; |
case XML_TOK_PI: |
return XML_ROLE_PI; |
case XML_TOK_COMMENT: |
return XML_ROLE_COMMENT; |
case XML_TOK_INSTANCE_START: |
state->handler = error; |
return XML_ROLE_INSTANCE_START; |
} |
return common(state, tok); |
} |
static int PTRCALL |
doctype0(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_DOCTYPE_NONE; |
case XML_TOK_NAME: |
case XML_TOK_PREFIXED_NAME: |
state->handler = doctype1; |
return XML_ROLE_DOCTYPE_NAME; |
} |
return common(state, tok); |
} |
static int PTRCALL |
doctype1(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_DOCTYPE_NONE; |
case XML_TOK_OPEN_BRACKET: |
state->handler = internalSubset; |
return XML_ROLE_DOCTYPE_INTERNAL_SUBSET; |
case XML_TOK_DECL_CLOSE: |
state->handler = prolog2; |
return XML_ROLE_DOCTYPE_CLOSE; |
case XML_TOK_NAME: |
if (XmlNameMatchesAscii(enc, ptr, end, KW_SYSTEM)) { |
state->handler = doctype3; |
return XML_ROLE_DOCTYPE_NONE; |
} |
if (XmlNameMatchesAscii(enc, ptr, end, KW_PUBLIC)) { |
state->handler = doctype2; |
return XML_ROLE_DOCTYPE_NONE; |
} |
break; |
} |
return common(state, tok); |
} |
static int PTRCALL |
doctype2(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_DOCTYPE_NONE; |
case XML_TOK_LITERAL: |
state->handler = doctype3; |
return XML_ROLE_DOCTYPE_PUBLIC_ID; |
} |
return common(state, tok); |
} |
static int PTRCALL |
doctype3(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_DOCTYPE_NONE; |
case XML_TOK_LITERAL: |
state->handler = doctype4; |
return XML_ROLE_DOCTYPE_SYSTEM_ID; |
} |
return common(state, tok); |
} |
static int PTRCALL |
doctype4(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_DOCTYPE_NONE; |
case XML_TOK_OPEN_BRACKET: |
state->handler = internalSubset; |
return XML_ROLE_DOCTYPE_INTERNAL_SUBSET; |
case XML_TOK_DECL_CLOSE: |
state->handler = prolog2; |
return XML_ROLE_DOCTYPE_CLOSE; |
} |
return common(state, tok); |
} |
static int PTRCALL |
doctype5(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_DOCTYPE_NONE; |
case XML_TOK_DECL_CLOSE: |
state->handler = prolog2; |
return XML_ROLE_DOCTYPE_CLOSE; |
} |
return common(state, tok); |
} |
static int PTRCALL |
internalSubset(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_NONE; |
case XML_TOK_DECL_OPEN: |
if (XmlNameMatchesAscii(enc, |
ptr + 2 * MIN_BYTES_PER_CHAR(enc), |
end, |
KW_ENTITY)) { |
state->handler = entity0; |
return XML_ROLE_ENTITY_NONE; |
} |
if (XmlNameMatchesAscii(enc, |
ptr + 2 * MIN_BYTES_PER_CHAR(enc), |
end, |
KW_ATTLIST)) { |
state->handler = attlist0; |
return XML_ROLE_ATTLIST_NONE; |
} |
if (XmlNameMatchesAscii(enc, |
ptr + 2 * MIN_BYTES_PER_CHAR(enc), |
end, |
KW_ELEMENT)) { |
state->handler = element0; |
return XML_ROLE_ELEMENT_NONE; |
} |
if (XmlNameMatchesAscii(enc, |
ptr + 2 * MIN_BYTES_PER_CHAR(enc), |
end, |
KW_NOTATION)) { |
state->handler = notation0; |
return XML_ROLE_NOTATION_NONE; |
} |
break; |
case XML_TOK_PI: |
return XML_ROLE_PI; |
case XML_TOK_COMMENT: |
return XML_ROLE_COMMENT; |
case XML_TOK_PARAM_ENTITY_REF: |
return XML_ROLE_PARAM_ENTITY_REF; |
case XML_TOK_CLOSE_BRACKET: |
state->handler = doctype5; |
return XML_ROLE_DOCTYPE_NONE; |
case XML_TOK_NONE: |
return XML_ROLE_NONE; |
} |
return common(state, tok); |
} |
#ifdef XML_DTD |
static int PTRCALL |
externalSubset0(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
state->handler = externalSubset1; |
if (tok == XML_TOK_XML_DECL) |
return XML_ROLE_TEXT_DECL; |
return externalSubset1(state, tok, ptr, end, enc); |
} |
static int PTRCALL |
externalSubset1(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_COND_SECT_OPEN: |
state->handler = condSect0; |
return XML_ROLE_NONE; |
case XML_TOK_COND_SECT_CLOSE: |
if (state->includeLevel == 0) |
break; |
state->includeLevel -= 1; |
return XML_ROLE_NONE; |
case XML_TOK_PROLOG_S: |
return XML_ROLE_NONE; |
case XML_TOK_CLOSE_BRACKET: |
break; |
case XML_TOK_NONE: |
if (state->includeLevel) |
break; |
return XML_ROLE_NONE; |
default: |
return internalSubset(state, tok, ptr, end, enc); |
} |
return common(state, tok); |
} |
#endif /* XML_DTD */ |
static int PTRCALL |
entity0(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_ENTITY_NONE; |
case XML_TOK_PERCENT: |
state->handler = entity1; |
return XML_ROLE_ENTITY_NONE; |
case XML_TOK_NAME: |
state->handler = entity2; |
return XML_ROLE_GENERAL_ENTITY_NAME; |
} |
return common(state, tok); |
} |
static int PTRCALL |
entity1(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_ENTITY_NONE; |
case XML_TOK_NAME: |
state->handler = entity7; |
return XML_ROLE_PARAM_ENTITY_NAME; |
} |
return common(state, tok); |
} |
static int PTRCALL |
entity2(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_ENTITY_NONE; |
case XML_TOK_NAME: |
if (XmlNameMatchesAscii(enc, ptr, end, KW_SYSTEM)) { |
state->handler = entity4; |
return XML_ROLE_ENTITY_NONE; |
} |
if (XmlNameMatchesAscii(enc, ptr, end, KW_PUBLIC)) { |
state->handler = entity3; |
return XML_ROLE_ENTITY_NONE; |
} |
break; |
case XML_TOK_LITERAL: |
state->handler = declClose; |
state->role_none = XML_ROLE_ENTITY_NONE; |
return XML_ROLE_ENTITY_VALUE; |
} |
return common(state, tok); |
} |
static int PTRCALL |
entity3(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_ENTITY_NONE; |
case XML_TOK_LITERAL: |
state->handler = entity4; |
return XML_ROLE_ENTITY_PUBLIC_ID; |
} |
return common(state, tok); |
} |
static int PTRCALL |
entity4(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_ENTITY_NONE; |
case XML_TOK_LITERAL: |
state->handler = entity5; |
return XML_ROLE_ENTITY_SYSTEM_ID; |
} |
return common(state, tok); |
} |
static int PTRCALL |
entity5(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_ENTITY_NONE; |
case XML_TOK_DECL_CLOSE: |
setTopLevel(state); |
return XML_ROLE_ENTITY_COMPLETE; |
case XML_TOK_NAME: |
if (XmlNameMatchesAscii(enc, ptr, end, KW_NDATA)) { |
state->handler = entity6; |
return XML_ROLE_ENTITY_NONE; |
} |
break; |
} |
return common(state, tok); |
} |
static int PTRCALL |
entity6(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_ENTITY_NONE; |
case XML_TOK_NAME: |
state->handler = declClose; |
state->role_none = XML_ROLE_ENTITY_NONE; |
return XML_ROLE_ENTITY_NOTATION_NAME; |
} |
return common(state, tok); |
} |
static int PTRCALL |
entity7(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_ENTITY_NONE; |
case XML_TOK_NAME: |
if (XmlNameMatchesAscii(enc, ptr, end, KW_SYSTEM)) { |
state->handler = entity9; |
return XML_ROLE_ENTITY_NONE; |
} |
if (XmlNameMatchesAscii(enc, ptr, end, KW_PUBLIC)) { |
state->handler = entity8; |
return XML_ROLE_ENTITY_NONE; |
} |
break; |
case XML_TOK_LITERAL: |
state->handler = declClose; |
state->role_none = XML_ROLE_ENTITY_NONE; |
return XML_ROLE_ENTITY_VALUE; |
} |
return common(state, tok); |
} |
static int PTRCALL |
entity8(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_ENTITY_NONE; |
case XML_TOK_LITERAL: |
state->handler = entity9; |
return XML_ROLE_ENTITY_PUBLIC_ID; |
} |
return common(state, tok); |
} |
static int PTRCALL |
entity9(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_ENTITY_NONE; |
case XML_TOK_LITERAL: |
state->handler = entity10; |
return XML_ROLE_ENTITY_SYSTEM_ID; |
} |
return common(state, tok); |
} |
static int PTRCALL |
entity10(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_ENTITY_NONE; |
case XML_TOK_DECL_CLOSE: |
setTopLevel(state); |
return XML_ROLE_ENTITY_COMPLETE; |
} |
return common(state, tok); |
} |
static int PTRCALL |
notation0(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_NOTATION_NONE; |
case XML_TOK_NAME: |
state->handler = notation1; |
return XML_ROLE_NOTATION_NAME; |
} |
return common(state, tok); |
} |
static int PTRCALL |
notation1(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_NOTATION_NONE; |
case XML_TOK_NAME: |
if (XmlNameMatchesAscii(enc, ptr, end, KW_SYSTEM)) { |
state->handler = notation3; |
return XML_ROLE_NOTATION_NONE; |
} |
if (XmlNameMatchesAscii(enc, ptr, end, KW_PUBLIC)) { |
state->handler = notation2; |
return XML_ROLE_NOTATION_NONE; |
} |
break; |
} |
return common(state, tok); |
} |
static int PTRCALL |
notation2(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_NOTATION_NONE; |
case XML_TOK_LITERAL: |
state->handler = notation4; |
return XML_ROLE_NOTATION_PUBLIC_ID; |
} |
return common(state, tok); |
} |
static int PTRCALL |
notation3(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_NOTATION_NONE; |
case XML_TOK_LITERAL: |
state->handler = declClose; |
state->role_none = XML_ROLE_NOTATION_NONE; |
return XML_ROLE_NOTATION_SYSTEM_ID; |
} |
return common(state, tok); |
} |
static int PTRCALL |
notation4(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_NOTATION_NONE; |
case XML_TOK_LITERAL: |
state->handler = declClose; |
state->role_none = XML_ROLE_NOTATION_NONE; |
return XML_ROLE_NOTATION_SYSTEM_ID; |
case XML_TOK_DECL_CLOSE: |
setTopLevel(state); |
return XML_ROLE_NOTATION_NO_SYSTEM_ID; |
} |
return common(state, tok); |
} |
static int PTRCALL |
attlist0(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_ATTLIST_NONE; |
case XML_TOK_NAME: |
case XML_TOK_PREFIXED_NAME: |
state->handler = attlist1; |
return XML_ROLE_ATTLIST_ELEMENT_NAME; |
} |
return common(state, tok); |
} |
static int PTRCALL |
attlist1(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_ATTLIST_NONE; |
case XML_TOK_DECL_CLOSE: |
setTopLevel(state); |
return XML_ROLE_ATTLIST_NONE; |
case XML_TOK_NAME: |
case XML_TOK_PREFIXED_NAME: |
state->handler = attlist2; |
return XML_ROLE_ATTRIBUTE_NAME; |
} |
return common(state, tok); |
} |
static int PTRCALL |
attlist2(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_ATTLIST_NONE; |
case XML_TOK_NAME: |
{ |
static const char * const types[] = { |
KW_CDATA, |
KW_ID, |
KW_IDREF, |
KW_IDREFS, |
KW_ENTITY, |
KW_ENTITIES, |
KW_NMTOKEN, |
KW_NMTOKENS, |
}; |
int i; |
for (i = 0; i < (int)(sizeof(types)/sizeof(types[0])); i++) |
if (XmlNameMatchesAscii(enc, ptr, end, types[i])) { |
state->handler = attlist8; |
return XML_ROLE_ATTRIBUTE_TYPE_CDATA + i; |
} |
} |
if (XmlNameMatchesAscii(enc, ptr, end, KW_NOTATION)) { |
state->handler = attlist5; |
return XML_ROLE_ATTLIST_NONE; |
} |
break; |
case XML_TOK_OPEN_PAREN: |
state->handler = attlist3; |
return XML_ROLE_ATTLIST_NONE; |
} |
return common(state, tok); |
} |
static int PTRCALL |
attlist3(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_ATTLIST_NONE; |
case XML_TOK_NMTOKEN: |
case XML_TOK_NAME: |
case XML_TOK_PREFIXED_NAME: |
state->handler = attlist4; |
return XML_ROLE_ATTRIBUTE_ENUM_VALUE; |
} |
return common(state, tok); |
} |
static int PTRCALL |
attlist4(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_ATTLIST_NONE; |
case XML_TOK_CLOSE_PAREN: |
state->handler = attlist8; |
return XML_ROLE_ATTLIST_NONE; |
case XML_TOK_OR: |
state->handler = attlist3; |
return XML_ROLE_ATTLIST_NONE; |
} |
return common(state, tok); |
} |
static int PTRCALL |
attlist5(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_ATTLIST_NONE; |
case XML_TOK_OPEN_PAREN: |
state->handler = attlist6; |
return XML_ROLE_ATTLIST_NONE; |
} |
return common(state, tok); |
} |
static int PTRCALL |
attlist6(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_ATTLIST_NONE; |
case XML_TOK_NAME: |
state->handler = attlist7; |
return XML_ROLE_ATTRIBUTE_NOTATION_VALUE; |
} |
return common(state, tok); |
} |
static int PTRCALL |
attlist7(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_ATTLIST_NONE; |
case XML_TOK_CLOSE_PAREN: |
state->handler = attlist8; |
return XML_ROLE_ATTLIST_NONE; |
case XML_TOK_OR: |
state->handler = attlist6; |
return XML_ROLE_ATTLIST_NONE; |
} |
return common(state, tok); |
} |
/* default value */ |
static int PTRCALL |
attlist8(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_ATTLIST_NONE; |
case XML_TOK_POUND_NAME: |
if (XmlNameMatchesAscii(enc, |
ptr + MIN_BYTES_PER_CHAR(enc), |
end, |
KW_IMPLIED)) { |
state->handler = attlist1; |
return XML_ROLE_IMPLIED_ATTRIBUTE_VALUE; |
} |
if (XmlNameMatchesAscii(enc, |
ptr + MIN_BYTES_PER_CHAR(enc), |
end, |
KW_REQUIRED)) { |
state->handler = attlist1; |
return XML_ROLE_REQUIRED_ATTRIBUTE_VALUE; |
} |
if (XmlNameMatchesAscii(enc, |
ptr + MIN_BYTES_PER_CHAR(enc), |
end, |
KW_FIXED)) { |
state->handler = attlist9; |
return XML_ROLE_ATTLIST_NONE; |
} |
break; |
case XML_TOK_LITERAL: |
state->handler = attlist1; |
return XML_ROLE_DEFAULT_ATTRIBUTE_VALUE; |
} |
return common(state, tok); |
} |
static int PTRCALL |
attlist9(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_ATTLIST_NONE; |
case XML_TOK_LITERAL: |
state->handler = attlist1; |
return XML_ROLE_FIXED_ATTRIBUTE_VALUE; |
} |
return common(state, tok); |
} |
static int PTRCALL |
element0(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_ELEMENT_NONE; |
case XML_TOK_NAME: |
case XML_TOK_PREFIXED_NAME: |
state->handler = element1; |
return XML_ROLE_ELEMENT_NAME; |
} |
return common(state, tok); |
} |
static int PTRCALL |
element1(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_ELEMENT_NONE; |
case XML_TOK_NAME: |
if (XmlNameMatchesAscii(enc, ptr, end, KW_EMPTY)) { |
state->handler = declClose; |
state->role_none = XML_ROLE_ELEMENT_NONE; |
return XML_ROLE_CONTENT_EMPTY; |
} |
if (XmlNameMatchesAscii(enc, ptr, end, KW_ANY)) { |
state->handler = declClose; |
state->role_none = XML_ROLE_ELEMENT_NONE; |
return XML_ROLE_CONTENT_ANY; |
} |
break; |
case XML_TOK_OPEN_PAREN: |
state->handler = element2; |
state->level = 1; |
return XML_ROLE_GROUP_OPEN; |
} |
return common(state, tok); |
} |
static int PTRCALL |
element2(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_ELEMENT_NONE; |
case XML_TOK_POUND_NAME: |
if (XmlNameMatchesAscii(enc, |
ptr + MIN_BYTES_PER_CHAR(enc), |
end, |
KW_PCDATA)) { |
state->handler = element3; |
return XML_ROLE_CONTENT_PCDATA; |
} |
break; |
case XML_TOK_OPEN_PAREN: |
state->level = 2; |
state->handler = element6; |
return XML_ROLE_GROUP_OPEN; |
case XML_TOK_NAME: |
case XML_TOK_PREFIXED_NAME: |
state->handler = element7; |
return XML_ROLE_CONTENT_ELEMENT; |
case XML_TOK_NAME_QUESTION: |
state->handler = element7; |
return XML_ROLE_CONTENT_ELEMENT_OPT; |
case XML_TOK_NAME_ASTERISK: |
state->handler = element7; |
return XML_ROLE_CONTENT_ELEMENT_REP; |
case XML_TOK_NAME_PLUS: |
state->handler = element7; |
return XML_ROLE_CONTENT_ELEMENT_PLUS; |
} |
return common(state, tok); |
} |
static int PTRCALL |
element3(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_ELEMENT_NONE; |
case XML_TOK_CLOSE_PAREN: |
state->handler = declClose; |
state->role_none = XML_ROLE_ELEMENT_NONE; |
return XML_ROLE_GROUP_CLOSE; |
case XML_TOK_CLOSE_PAREN_ASTERISK: |
state->handler = declClose; |
state->role_none = XML_ROLE_ELEMENT_NONE; |
return XML_ROLE_GROUP_CLOSE_REP; |
case XML_TOK_OR: |
state->handler = element4; |
return XML_ROLE_ELEMENT_NONE; |
} |
return common(state, tok); |
} |
static int PTRCALL |
element4(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_ELEMENT_NONE; |
case XML_TOK_NAME: |
case XML_TOK_PREFIXED_NAME: |
state->handler = element5; |
return XML_ROLE_CONTENT_ELEMENT; |
} |
return common(state, tok); |
} |
static int PTRCALL |
element5(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_ELEMENT_NONE; |
case XML_TOK_CLOSE_PAREN_ASTERISK: |
state->handler = declClose; |
state->role_none = XML_ROLE_ELEMENT_NONE; |
return XML_ROLE_GROUP_CLOSE_REP; |
case XML_TOK_OR: |
state->handler = element4; |
return XML_ROLE_ELEMENT_NONE; |
} |
return common(state, tok); |
} |
static int PTRCALL |
element6(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_ELEMENT_NONE; |
case XML_TOK_OPEN_PAREN: |
state->level += 1; |
return XML_ROLE_GROUP_OPEN; |
case XML_TOK_NAME: |
case XML_TOK_PREFIXED_NAME: |
state->handler = element7; |
return XML_ROLE_CONTENT_ELEMENT; |
case XML_TOK_NAME_QUESTION: |
state->handler = element7; |
return XML_ROLE_CONTENT_ELEMENT_OPT; |
case XML_TOK_NAME_ASTERISK: |
state->handler = element7; |
return XML_ROLE_CONTENT_ELEMENT_REP; |
case XML_TOK_NAME_PLUS: |
state->handler = element7; |
return XML_ROLE_CONTENT_ELEMENT_PLUS; |
} |
return common(state, tok); |
} |
static int PTRCALL |
element7(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_ELEMENT_NONE; |
case XML_TOK_CLOSE_PAREN: |
state->level -= 1; |
if (state->level == 0) { |
state->handler = declClose; |
state->role_none = XML_ROLE_ELEMENT_NONE; |
} |
return XML_ROLE_GROUP_CLOSE; |
case XML_TOK_CLOSE_PAREN_ASTERISK: |
state->level -= 1; |
if (state->level == 0) { |
state->handler = declClose; |
state->role_none = XML_ROLE_ELEMENT_NONE; |
} |
return XML_ROLE_GROUP_CLOSE_REP; |
case XML_TOK_CLOSE_PAREN_QUESTION: |
state->level -= 1; |
if (state->level == 0) { |
state->handler = declClose; |
state->role_none = XML_ROLE_ELEMENT_NONE; |
} |
return XML_ROLE_GROUP_CLOSE_OPT; |
case XML_TOK_CLOSE_PAREN_PLUS: |
state->level -= 1; |
if (state->level == 0) { |
state->handler = declClose; |
state->role_none = XML_ROLE_ELEMENT_NONE; |
} |
return XML_ROLE_GROUP_CLOSE_PLUS; |
case XML_TOK_COMMA: |
state->handler = element6; |
return XML_ROLE_GROUP_SEQUENCE; |
case XML_TOK_OR: |
state->handler = element6; |
return XML_ROLE_GROUP_CHOICE; |
} |
return common(state, tok); |
} |
#ifdef XML_DTD |
static int PTRCALL |
condSect0(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_NONE; |
case XML_TOK_NAME: |
if (XmlNameMatchesAscii(enc, ptr, end, KW_INCLUDE)) { |
state->handler = condSect1; |
return XML_ROLE_NONE; |
} |
if (XmlNameMatchesAscii(enc, ptr, end, KW_IGNORE)) { |
state->handler = condSect2; |
return XML_ROLE_NONE; |
} |
break; |
} |
return common(state, tok); |
} |
static int PTRCALL |
condSect1(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_NONE; |
case XML_TOK_OPEN_BRACKET: |
state->handler = externalSubset1; |
state->includeLevel += 1; |
return XML_ROLE_NONE; |
} |
return common(state, tok); |
} |
static int PTRCALL |
condSect2(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return XML_ROLE_NONE; |
case XML_TOK_OPEN_BRACKET: |
state->handler = externalSubset1; |
return XML_ROLE_IGNORE_SECT; |
} |
return common(state, tok); |
} |
#endif /* XML_DTD */ |
static int PTRCALL |
declClose(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
switch (tok) { |
case XML_TOK_PROLOG_S: |
return state->role_none; |
case XML_TOK_DECL_CLOSE: |
setTopLevel(state); |
return state->role_none; |
} |
return common(state, tok); |
} |
static int PTRCALL |
error(PROLOG_STATE *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc) |
{ |
return XML_ROLE_NONE; |
} |
static int FASTCALL |
common(PROLOG_STATE *state, int tok) |
{ |
#ifdef XML_DTD |
if (!state->documentEntity && tok == XML_TOK_PARAM_ENTITY_REF) |
return XML_ROLE_INNER_PARAM_ENTITY_REF; |
#endif |
state->handler = error; |
return XML_ROLE_ERROR; |
} |
void |
XmlPrologStateInit(PROLOG_STATE *state) |
{ |
state->handler = prolog0; |
#ifdef XML_DTD |
state->documentEntity = 1; |
state->includeLevel = 0; |
state->inEntityValue = 0; |
#endif /* XML_DTD */ |
} |
#ifdef XML_DTD |
void |
XmlPrologStateInitExternalEntity(PROLOG_STATE *state) |
{ |
state->handler = externalSubset0; |
state->documentEntity = 0; |
state->includeLevel = 0; |
} |
#endif /* XML_DTD */ |
/contrib/sdk/sources/expat/lib/xmlrole.h |
---|
0,0 → 1,114 |
/* Copyright (c) 1998, 1999 Thai Open Source Software Center Ltd |
See the file COPYING for copying permission. |
*/ |
#ifndef XmlRole_INCLUDED |
#define XmlRole_INCLUDED 1 |
#ifdef __VMS |
/* 0 1 2 3 0 1 2 3 |
1234567890123456789012345678901 1234567890123456789012345678901 */ |
#define XmlPrologStateInitExternalEntity XmlPrologStateInitExternalEnt |
#endif |
#include "xmltok.h" |
#ifdef __cplusplus |
extern "C" { |
#endif |
enum { |
XML_ROLE_ERROR = -1, |
XML_ROLE_NONE = 0, |
XML_ROLE_XML_DECL, |
XML_ROLE_INSTANCE_START, |
XML_ROLE_DOCTYPE_NONE, |
XML_ROLE_DOCTYPE_NAME, |
XML_ROLE_DOCTYPE_SYSTEM_ID, |
XML_ROLE_DOCTYPE_PUBLIC_ID, |
XML_ROLE_DOCTYPE_INTERNAL_SUBSET, |
XML_ROLE_DOCTYPE_CLOSE, |
XML_ROLE_GENERAL_ENTITY_NAME, |
XML_ROLE_PARAM_ENTITY_NAME, |
XML_ROLE_ENTITY_NONE, |
XML_ROLE_ENTITY_VALUE, |
XML_ROLE_ENTITY_SYSTEM_ID, |
XML_ROLE_ENTITY_PUBLIC_ID, |
XML_ROLE_ENTITY_COMPLETE, |
XML_ROLE_ENTITY_NOTATION_NAME, |
XML_ROLE_NOTATION_NONE, |
XML_ROLE_NOTATION_NAME, |
XML_ROLE_NOTATION_SYSTEM_ID, |
XML_ROLE_NOTATION_NO_SYSTEM_ID, |
XML_ROLE_NOTATION_PUBLIC_ID, |
XML_ROLE_ATTRIBUTE_NAME, |
XML_ROLE_ATTRIBUTE_TYPE_CDATA, |
XML_ROLE_ATTRIBUTE_TYPE_ID, |
XML_ROLE_ATTRIBUTE_TYPE_IDREF, |
XML_ROLE_ATTRIBUTE_TYPE_IDREFS, |
XML_ROLE_ATTRIBUTE_TYPE_ENTITY, |
XML_ROLE_ATTRIBUTE_TYPE_ENTITIES, |
XML_ROLE_ATTRIBUTE_TYPE_NMTOKEN, |
XML_ROLE_ATTRIBUTE_TYPE_NMTOKENS, |
XML_ROLE_ATTRIBUTE_ENUM_VALUE, |
XML_ROLE_ATTRIBUTE_NOTATION_VALUE, |
XML_ROLE_ATTLIST_NONE, |
XML_ROLE_ATTLIST_ELEMENT_NAME, |
XML_ROLE_IMPLIED_ATTRIBUTE_VALUE, |
XML_ROLE_REQUIRED_ATTRIBUTE_VALUE, |
XML_ROLE_DEFAULT_ATTRIBUTE_VALUE, |
XML_ROLE_FIXED_ATTRIBUTE_VALUE, |
XML_ROLE_ELEMENT_NONE, |
XML_ROLE_ELEMENT_NAME, |
XML_ROLE_CONTENT_ANY, |
XML_ROLE_CONTENT_EMPTY, |
XML_ROLE_CONTENT_PCDATA, |
XML_ROLE_GROUP_OPEN, |
XML_ROLE_GROUP_CLOSE, |
XML_ROLE_GROUP_CLOSE_REP, |
XML_ROLE_GROUP_CLOSE_OPT, |
XML_ROLE_GROUP_CLOSE_PLUS, |
XML_ROLE_GROUP_CHOICE, |
XML_ROLE_GROUP_SEQUENCE, |
XML_ROLE_CONTENT_ELEMENT, |
XML_ROLE_CONTENT_ELEMENT_REP, |
XML_ROLE_CONTENT_ELEMENT_OPT, |
XML_ROLE_CONTENT_ELEMENT_PLUS, |
XML_ROLE_PI, |
XML_ROLE_COMMENT, |
#ifdef XML_DTD |
XML_ROLE_TEXT_DECL, |
XML_ROLE_IGNORE_SECT, |
XML_ROLE_INNER_PARAM_ENTITY_REF, |
#endif /* XML_DTD */ |
XML_ROLE_PARAM_ENTITY_REF |
}; |
typedef struct prolog_state { |
int (PTRCALL *handler) (struct prolog_state *state, |
int tok, |
const char *ptr, |
const char *end, |
const ENCODING *enc); |
unsigned level; |
int role_none; |
#ifdef XML_DTD |
unsigned includeLevel; |
int documentEntity; |
int inEntityValue; |
#endif /* XML_DTD */ |
} PROLOG_STATE; |
void XmlPrologStateInit(PROLOG_STATE *); |
#ifdef XML_DTD |
void XmlPrologStateInitExternalEntity(PROLOG_STATE *); |
#endif /* XML_DTD */ |
#define XmlTokenRole(state, tok, ptr, end, enc) \ |
(((state)->handler)(state, tok, ptr, end, enc)) |
#ifdef __cplusplus |
} |
#endif |
#endif /* not XmlRole_INCLUDED */ |
/contrib/sdk/sources/expat/lib/xmltok.c |
---|
0,0 → 1,1651 |
/* Copyright (c) 1998, 1999 Thai Open Source Software Center Ltd |
See the file COPYING for copying permission. |
*/ |
#include <stddef.h> |
#ifdef COMPILED_FROM_DSP |
#include "winconfig.h" |
#elif defined(MACOS_CLASSIC) |
#include "macconfig.h" |
#elif defined(__amigaos__) |
#include "amigaconfig.h" |
#elif defined(__WATCOMC__) |
#include "watcomconfig.h" |
#else |
#ifdef HAVE_EXPAT_CONFIG_H |
#include <expat_config.h> |
#endif |
#endif /* ndef COMPILED_FROM_DSP */ |
#include "expat_external.h" |
#include "internal.h" |
#include "xmltok.h" |
#include "nametab.h" |
#ifdef XML_DTD |
#define IGNORE_SECTION_TOK_VTABLE , PREFIX(ignoreSectionTok) |
#else |
#define IGNORE_SECTION_TOK_VTABLE /* as nothing */ |
#endif |
#define VTABLE1 \ |
{ PREFIX(prologTok), PREFIX(contentTok), \ |
PREFIX(cdataSectionTok) IGNORE_SECTION_TOK_VTABLE }, \ |
{ PREFIX(attributeValueTok), PREFIX(entityValueTok) }, \ |
PREFIX(sameName), \ |
PREFIX(nameMatchesAscii), \ |
PREFIX(nameLength), \ |
PREFIX(skipS), \ |
PREFIX(getAtts), \ |
PREFIX(charRefNumber), \ |
PREFIX(predefinedEntityName), \ |
PREFIX(updatePosition), \ |
PREFIX(isPublicId) |
#define VTABLE VTABLE1, PREFIX(toUtf8), PREFIX(toUtf16) |
#define UCS2_GET_NAMING(pages, hi, lo) \ |
(namingBitmap[(pages[hi] << 3) + ((lo) >> 5)] & (1 << ((lo) & 0x1F))) |
/* A 2 byte UTF-8 representation splits the characters 11 bits between |
the bottom 5 and 6 bits of the bytes. We need 8 bits to index into |
pages, 3 bits to add to that index and 5 bits to generate the mask. |
*/ |
#define UTF8_GET_NAMING2(pages, byte) \ |
(namingBitmap[((pages)[(((byte)[0]) >> 2) & 7] << 3) \ |
+ ((((byte)[0]) & 3) << 1) \ |
+ ((((byte)[1]) >> 5) & 1)] \ |
& (1 << (((byte)[1]) & 0x1F))) |
/* A 3 byte UTF-8 representation splits the characters 16 bits between |
the bottom 4, 6 and 6 bits of the bytes. We need 8 bits to index |
into pages, 3 bits to add to that index and 5 bits to generate the |
mask. |
*/ |
#define UTF8_GET_NAMING3(pages, byte) \ |
(namingBitmap[((pages)[((((byte)[0]) & 0xF) << 4) \ |
+ ((((byte)[1]) >> 2) & 0xF)] \ |
<< 3) \ |
+ ((((byte)[1]) & 3) << 1) \ |
+ ((((byte)[2]) >> 5) & 1)] \ |
& (1 << (((byte)[2]) & 0x1F))) |
#define UTF8_GET_NAMING(pages, p, n) \ |
((n) == 2 \ |
? UTF8_GET_NAMING2(pages, (const unsigned char *)(p)) \ |
: ((n) == 3 \ |
? UTF8_GET_NAMING3(pages, (const unsigned char *)(p)) \ |
: 0)) |
/* Detection of invalid UTF-8 sequences is based on Table 3.1B |
of Unicode 3.2: http://www.unicode.org/unicode/reports/tr28/ |
with the additional restriction of not allowing the Unicode |
code points 0xFFFF and 0xFFFE (sequences EF,BF,BF and EF,BF,BE). |
Implementation details: |
(A & 0x80) == 0 means A < 0x80 |
and |
(A & 0xC0) == 0xC0 means A > 0xBF |
*/ |
#define UTF8_INVALID2(p) \ |
((*p) < 0xC2 || ((p)[1] & 0x80) == 0 || ((p)[1] & 0xC0) == 0xC0) |
#define UTF8_INVALID3(p) \ |
(((p)[2] & 0x80) == 0 \ |
|| \ |
((*p) == 0xEF && (p)[1] == 0xBF \ |
? \ |
(p)[2] > 0xBD \ |
: \ |
((p)[2] & 0xC0) == 0xC0) \ |
|| \ |
((*p) == 0xE0 \ |
? \ |
(p)[1] < 0xA0 || ((p)[1] & 0xC0) == 0xC0 \ |
: \ |
((p)[1] & 0x80) == 0 \ |
|| \ |
((*p) == 0xED ? (p)[1] > 0x9F : ((p)[1] & 0xC0) == 0xC0))) |
#define UTF8_INVALID4(p) \ |
(((p)[3] & 0x80) == 0 || ((p)[3] & 0xC0) == 0xC0 \ |
|| \ |
((p)[2] & 0x80) == 0 || ((p)[2] & 0xC0) == 0xC0 \ |
|| \ |
((*p) == 0xF0 \ |
? \ |
(p)[1] < 0x90 || ((p)[1] & 0xC0) == 0xC0 \ |
: \ |
((p)[1] & 0x80) == 0 \ |
|| \ |
((*p) == 0xF4 ? (p)[1] > 0x8F : ((p)[1] & 0xC0) == 0xC0))) |
static int PTRFASTCALL |
isNever(const ENCODING *enc, const char *p) |
{ |
return 0; |
} |
static int PTRFASTCALL |
utf8_isName2(const ENCODING *enc, const char *p) |
{ |
return UTF8_GET_NAMING2(namePages, (const unsigned char *)p); |
} |
static int PTRFASTCALL |
utf8_isName3(const ENCODING *enc, const char *p) |
{ |
return UTF8_GET_NAMING3(namePages, (const unsigned char *)p); |
} |
#define utf8_isName4 isNever |
static int PTRFASTCALL |
utf8_isNmstrt2(const ENCODING *enc, const char *p) |
{ |
return UTF8_GET_NAMING2(nmstrtPages, (const unsigned char *)p); |
} |
static int PTRFASTCALL |
utf8_isNmstrt3(const ENCODING *enc, const char *p) |
{ |
return UTF8_GET_NAMING3(nmstrtPages, (const unsigned char *)p); |
} |
#define utf8_isNmstrt4 isNever |
static int PTRFASTCALL |
utf8_isInvalid2(const ENCODING *enc, const char *p) |
{ |
return UTF8_INVALID2((const unsigned char *)p); |
} |
static int PTRFASTCALL |
utf8_isInvalid3(const ENCODING *enc, const char *p) |
{ |
return UTF8_INVALID3((const unsigned char *)p); |
} |
static int PTRFASTCALL |
utf8_isInvalid4(const ENCODING *enc, const char *p) |
{ |
return UTF8_INVALID4((const unsigned char *)p); |
} |
struct normal_encoding { |
ENCODING enc; |
unsigned char type[256]; |
#ifdef XML_MIN_SIZE |
int (PTRFASTCALL *byteType)(const ENCODING *, const char *); |
int (PTRFASTCALL *isNameMin)(const ENCODING *, const char *); |
int (PTRFASTCALL *isNmstrtMin)(const ENCODING *, const char *); |
int (PTRFASTCALL *byteToAscii)(const ENCODING *, const char *); |
int (PTRCALL *charMatches)(const ENCODING *, const char *, int); |
#endif /* XML_MIN_SIZE */ |
int (PTRFASTCALL *isName2)(const ENCODING *, const char *); |
int (PTRFASTCALL *isName3)(const ENCODING *, const char *); |
int (PTRFASTCALL *isName4)(const ENCODING *, const char *); |
int (PTRFASTCALL *isNmstrt2)(const ENCODING *, const char *); |
int (PTRFASTCALL *isNmstrt3)(const ENCODING *, const char *); |
int (PTRFASTCALL *isNmstrt4)(const ENCODING *, const char *); |
int (PTRFASTCALL *isInvalid2)(const ENCODING *, const char *); |
int (PTRFASTCALL *isInvalid3)(const ENCODING *, const char *); |
int (PTRFASTCALL *isInvalid4)(const ENCODING *, const char *); |
}; |
#define AS_NORMAL_ENCODING(enc) ((const struct normal_encoding *) (enc)) |
#ifdef XML_MIN_SIZE |
#define STANDARD_VTABLE(E) \ |
E ## byteType, \ |
E ## isNameMin, \ |
E ## isNmstrtMin, \ |
E ## byteToAscii, \ |
E ## charMatches, |
#else |
#define STANDARD_VTABLE(E) /* as nothing */ |
#endif |
#define NORMAL_VTABLE(E) \ |
E ## isName2, \ |
E ## isName3, \ |
E ## isName4, \ |
E ## isNmstrt2, \ |
E ## isNmstrt3, \ |
E ## isNmstrt4, \ |
E ## isInvalid2, \ |
E ## isInvalid3, \ |
E ## isInvalid4 |
static int FASTCALL checkCharRefNumber(int); |
#include "xmltok_impl.h" |
#include "ascii.h" |
#ifdef XML_MIN_SIZE |
#define sb_isNameMin isNever |
#define sb_isNmstrtMin isNever |
#endif |
#ifdef XML_MIN_SIZE |
#define MINBPC(enc) ((enc)->minBytesPerChar) |
#else |
/* minimum bytes per character */ |
#define MINBPC(enc) 1 |
#endif |
#define SB_BYTE_TYPE(enc, p) \ |
(((struct normal_encoding *)(enc))->type[(unsigned char)*(p)]) |
#ifdef XML_MIN_SIZE |
static int PTRFASTCALL |
sb_byteType(const ENCODING *enc, const char *p) |
{ |
return SB_BYTE_TYPE(enc, p); |
} |
#define BYTE_TYPE(enc, p) \ |
(AS_NORMAL_ENCODING(enc)->byteType(enc, p)) |
#else |
#define BYTE_TYPE(enc, p) SB_BYTE_TYPE(enc, p) |
#endif |
#ifdef XML_MIN_SIZE |
#define BYTE_TO_ASCII(enc, p) \ |
(AS_NORMAL_ENCODING(enc)->byteToAscii(enc, p)) |
static int PTRFASTCALL |
sb_byteToAscii(const ENCODING *enc, const char *p) |
{ |
return *p; |
} |
#else |
#define BYTE_TO_ASCII(enc, p) (*(p)) |
#endif |
#define IS_NAME_CHAR(enc, p, n) \ |
(AS_NORMAL_ENCODING(enc)->isName ## n(enc, p)) |
#define IS_NMSTRT_CHAR(enc, p, n) \ |
(AS_NORMAL_ENCODING(enc)->isNmstrt ## n(enc, p)) |
#define IS_INVALID_CHAR(enc, p, n) \ |
(AS_NORMAL_ENCODING(enc)->isInvalid ## n(enc, p)) |
#ifdef XML_MIN_SIZE |
#define IS_NAME_CHAR_MINBPC(enc, p) \ |
(AS_NORMAL_ENCODING(enc)->isNameMin(enc, p)) |
#define IS_NMSTRT_CHAR_MINBPC(enc, p) \ |
(AS_NORMAL_ENCODING(enc)->isNmstrtMin(enc, p)) |
#else |
#define IS_NAME_CHAR_MINBPC(enc, p) (0) |
#define IS_NMSTRT_CHAR_MINBPC(enc, p) (0) |
#endif |
#ifdef XML_MIN_SIZE |
#define CHAR_MATCHES(enc, p, c) \ |
(AS_NORMAL_ENCODING(enc)->charMatches(enc, p, c)) |
static int PTRCALL |
sb_charMatches(const ENCODING *enc, const char *p, int c) |
{ |
return *p == c; |
} |
#else |
/* c is an ASCII character */ |
#define CHAR_MATCHES(enc, p, c) (*(p) == c) |
#endif |
#define PREFIX(ident) normal_ ## ident |
#define XML_TOK_IMPL_C |
#include "xmltok_impl.c" |
#undef XML_TOK_IMPL_C |
#undef MINBPC |
#undef BYTE_TYPE |
#undef BYTE_TO_ASCII |
#undef CHAR_MATCHES |
#undef IS_NAME_CHAR |
#undef IS_NAME_CHAR_MINBPC |
#undef IS_NMSTRT_CHAR |
#undef IS_NMSTRT_CHAR_MINBPC |
#undef IS_INVALID_CHAR |
enum { /* UTF8_cvalN is value of masked first byte of N byte sequence */ |
UTF8_cval1 = 0x00, |
UTF8_cval2 = 0xc0, |
UTF8_cval3 = 0xe0, |
UTF8_cval4 = 0xf0 |
}; |
static void PTRCALL |
utf8_toUtf8(const ENCODING *enc, |
const char **fromP, const char *fromLim, |
char **toP, const char *toLim) |
{ |
char *to; |
const char *from; |
if (fromLim - *fromP > toLim - *toP) { |
/* Avoid copying partial characters. */ |
for (fromLim = *fromP + (toLim - *toP); fromLim > *fromP; fromLim--) |
if (((unsigned char)fromLim[-1] & 0xc0) != 0x80) |
break; |
} |
for (to = *toP, from = *fromP; from != fromLim; from++, to++) |
*to = *from; |
*fromP = from; |
*toP = to; |
} |
static void PTRCALL |
utf8_toUtf16(const ENCODING *enc, |
const char **fromP, const char *fromLim, |
unsigned short **toP, const unsigned short *toLim) |
{ |
unsigned short *to = *toP; |
const char *from = *fromP; |
while (from != fromLim && to != toLim) { |
switch (((struct normal_encoding *)enc)->type[(unsigned char)*from]) { |
case BT_LEAD2: |
*to++ = (unsigned short)(((from[0] & 0x1f) << 6) | (from[1] & 0x3f)); |
from += 2; |
break; |
case BT_LEAD3: |
*to++ = (unsigned short)(((from[0] & 0xf) << 12) |
| ((from[1] & 0x3f) << 6) | (from[2] & 0x3f)); |
from += 3; |
break; |
case BT_LEAD4: |
{ |
unsigned long n; |
if (to + 1 == toLim) |
goto after; |
n = ((from[0] & 0x7) << 18) | ((from[1] & 0x3f) << 12) |
| ((from[2] & 0x3f) << 6) | (from[3] & 0x3f); |
n -= 0x10000; |
to[0] = (unsigned short)((n >> 10) | 0xD800); |
to[1] = (unsigned short)((n & 0x3FF) | 0xDC00); |
to += 2; |
from += 4; |
} |
break; |
default: |
*to++ = *from++; |
break; |
} |
} |
after: |
*fromP = from; |
*toP = to; |
} |
#ifdef XML_NS |
static const struct normal_encoding utf8_encoding_ns = { |
{ VTABLE1, utf8_toUtf8, utf8_toUtf16, 1, 1, 0 }, |
{ |
#include "asciitab.h" |
#include "utf8tab.h" |
}, |
STANDARD_VTABLE(sb_) NORMAL_VTABLE(utf8_) |
}; |
#endif |
static const struct normal_encoding utf8_encoding = { |
{ VTABLE1, utf8_toUtf8, utf8_toUtf16, 1, 1, 0 }, |
{ |
#define BT_COLON BT_NMSTRT |
#include "asciitab.h" |
#undef BT_COLON |
#include "utf8tab.h" |
}, |
STANDARD_VTABLE(sb_) NORMAL_VTABLE(utf8_) |
}; |
#ifdef XML_NS |
static const struct normal_encoding internal_utf8_encoding_ns = { |
{ VTABLE1, utf8_toUtf8, utf8_toUtf16, 1, 1, 0 }, |
{ |
#include "iasciitab.h" |
#include "utf8tab.h" |
}, |
STANDARD_VTABLE(sb_) NORMAL_VTABLE(utf8_) |
}; |
#endif |
static const struct normal_encoding internal_utf8_encoding = { |
{ VTABLE1, utf8_toUtf8, utf8_toUtf16, 1, 1, 0 }, |
{ |
#define BT_COLON BT_NMSTRT |
#include "iasciitab.h" |
#undef BT_COLON |
#include "utf8tab.h" |
}, |
STANDARD_VTABLE(sb_) NORMAL_VTABLE(utf8_) |
}; |
static void PTRCALL |
latin1_toUtf8(const ENCODING *enc, |
const char **fromP, const char *fromLim, |
char **toP, const char *toLim) |
{ |
for (;;) { |
unsigned char c; |
if (*fromP == fromLim) |
break; |
c = (unsigned char)**fromP; |
if (c & 0x80) { |
if (toLim - *toP < 2) |
break; |
*(*toP)++ = (char)((c >> 6) | UTF8_cval2); |
*(*toP)++ = (char)((c & 0x3f) | 0x80); |
(*fromP)++; |
} |
else { |
if (*toP == toLim) |
break; |
*(*toP)++ = *(*fromP)++; |
} |
} |
} |
static void PTRCALL |
latin1_toUtf16(const ENCODING *enc, |
const char **fromP, const char *fromLim, |
unsigned short **toP, const unsigned short *toLim) |
{ |
while (*fromP != fromLim && *toP != toLim) |
*(*toP)++ = (unsigned char)*(*fromP)++; |
} |
#ifdef XML_NS |
static const struct normal_encoding latin1_encoding_ns = { |
{ VTABLE1, latin1_toUtf8, latin1_toUtf16, 1, 0, 0 }, |
{ |
#include "asciitab.h" |
#include "latin1tab.h" |
}, |
STANDARD_VTABLE(sb_) |
}; |
#endif |
static const struct normal_encoding latin1_encoding = { |
{ VTABLE1, latin1_toUtf8, latin1_toUtf16, 1, 0, 0 }, |
{ |
#define BT_COLON BT_NMSTRT |
#include "asciitab.h" |
#undef BT_COLON |
#include "latin1tab.h" |
}, |
STANDARD_VTABLE(sb_) |
}; |
static void PTRCALL |
ascii_toUtf8(const ENCODING *enc, |
const char **fromP, const char *fromLim, |
char **toP, const char *toLim) |
{ |
while (*fromP != fromLim && *toP != toLim) |
*(*toP)++ = *(*fromP)++; |
} |
#ifdef XML_NS |
static const struct normal_encoding ascii_encoding_ns = { |
{ VTABLE1, ascii_toUtf8, latin1_toUtf16, 1, 1, 0 }, |
{ |
#include "asciitab.h" |
/* BT_NONXML == 0 */ |
}, |
STANDARD_VTABLE(sb_) |
}; |
#endif |
static const struct normal_encoding ascii_encoding = { |
{ VTABLE1, ascii_toUtf8, latin1_toUtf16, 1, 1, 0 }, |
{ |
#define BT_COLON BT_NMSTRT |
#include "asciitab.h" |
#undef BT_COLON |
/* BT_NONXML == 0 */ |
}, |
STANDARD_VTABLE(sb_) |
}; |
static int PTRFASTCALL |
unicode_byte_type(char hi, char lo) |
{ |
switch ((unsigned char)hi) { |
case 0xD8: case 0xD9: case 0xDA: case 0xDB: |
return BT_LEAD4; |
case 0xDC: case 0xDD: case 0xDE: case 0xDF: |
return BT_TRAIL; |
case 0xFF: |
switch ((unsigned char)lo) { |
case 0xFF: |
case 0xFE: |
return BT_NONXML; |
} |
break; |
} |
return BT_NONASCII; |
} |
#define DEFINE_UTF16_TO_UTF8(E) \ |
static void PTRCALL \ |
E ## toUtf8(const ENCODING *enc, \ |
const char **fromP, const char *fromLim, \ |
char **toP, const char *toLim) \ |
{ \ |
const char *from; \ |
for (from = *fromP; from != fromLim; from += 2) { \ |
int plane; \ |
unsigned char lo2; \ |
unsigned char lo = GET_LO(from); \ |
unsigned char hi = GET_HI(from); \ |
switch (hi) { \ |
case 0: \ |
if (lo < 0x80) { \ |
if (*toP == toLim) { \ |
*fromP = from; \ |
return; \ |
} \ |
*(*toP)++ = lo; \ |
break; \ |
} \ |
/* fall through */ \ |
case 0x1: case 0x2: case 0x3: \ |
case 0x4: case 0x5: case 0x6: case 0x7: \ |
if (toLim - *toP < 2) { \ |
*fromP = from; \ |
return; \ |
} \ |
*(*toP)++ = ((lo >> 6) | (hi << 2) | UTF8_cval2); \ |
*(*toP)++ = ((lo & 0x3f) | 0x80); \ |
break; \ |
default: \ |
if (toLim - *toP < 3) { \ |
*fromP = from; \ |
return; \ |
} \ |
/* 16 bits divided 4, 6, 6 amongst 3 bytes */ \ |
*(*toP)++ = ((hi >> 4) | UTF8_cval3); \ |
*(*toP)++ = (((hi & 0xf) << 2) | (lo >> 6) | 0x80); \ |
*(*toP)++ = ((lo & 0x3f) | 0x80); \ |
break; \ |
case 0xD8: case 0xD9: case 0xDA: case 0xDB: \ |
if (toLim - *toP < 4) { \ |
*fromP = from; \ |
return; \ |
} \ |
plane = (((hi & 0x3) << 2) | ((lo >> 6) & 0x3)) + 1; \ |
*(*toP)++ = ((plane >> 2) | UTF8_cval4); \ |
*(*toP)++ = (((lo >> 2) & 0xF) | ((plane & 0x3) << 4) | 0x80); \ |
from += 2; \ |
lo2 = GET_LO(from); \ |
*(*toP)++ = (((lo & 0x3) << 4) \ |
| ((GET_HI(from) & 0x3) << 2) \ |
| (lo2 >> 6) \ |
| 0x80); \ |
*(*toP)++ = ((lo2 & 0x3f) | 0x80); \ |
break; \ |
} \ |
} \ |
*fromP = from; \ |
} |
#define DEFINE_UTF16_TO_UTF16(E) \ |
static void PTRCALL \ |
E ## toUtf16(const ENCODING *enc, \ |
const char **fromP, const char *fromLim, \ |
unsigned short **toP, const unsigned short *toLim) \ |
{ \ |
/* Avoid copying first half only of surrogate */ \ |
if (fromLim - *fromP > ((toLim - *toP) << 1) \ |
&& (GET_HI(fromLim - 2) & 0xF8) == 0xD8) \ |
fromLim -= 2; \ |
for (; *fromP != fromLim && *toP != toLim; *fromP += 2) \ |
*(*toP)++ = (GET_HI(*fromP) << 8) | GET_LO(*fromP); \ |
} |
#define SET2(ptr, ch) \ |
(((ptr)[0] = ((ch) & 0xff)), ((ptr)[1] = ((ch) >> 8))) |
#define GET_LO(ptr) ((unsigned char)(ptr)[0]) |
#define GET_HI(ptr) ((unsigned char)(ptr)[1]) |
DEFINE_UTF16_TO_UTF8(little2_) |
DEFINE_UTF16_TO_UTF16(little2_) |
#undef SET2 |
#undef GET_LO |
#undef GET_HI |
#define SET2(ptr, ch) \ |
(((ptr)[0] = ((ch) >> 8)), ((ptr)[1] = ((ch) & 0xFF))) |
#define GET_LO(ptr) ((unsigned char)(ptr)[1]) |
#define GET_HI(ptr) ((unsigned char)(ptr)[0]) |
DEFINE_UTF16_TO_UTF8(big2_) |
DEFINE_UTF16_TO_UTF16(big2_) |
#undef SET2 |
#undef GET_LO |
#undef GET_HI |
#define LITTLE2_BYTE_TYPE(enc, p) \ |
((p)[1] == 0 \ |
? ((struct normal_encoding *)(enc))->type[(unsigned char)*(p)] \ |
: unicode_byte_type((p)[1], (p)[0])) |
#define LITTLE2_BYTE_TO_ASCII(enc, p) ((p)[1] == 0 ? (p)[0] : -1) |
#define LITTLE2_CHAR_MATCHES(enc, p, c) ((p)[1] == 0 && (p)[0] == c) |
#define LITTLE2_IS_NAME_CHAR_MINBPC(enc, p) \ |
UCS2_GET_NAMING(namePages, (unsigned char)p[1], (unsigned char)p[0]) |
#define LITTLE2_IS_NMSTRT_CHAR_MINBPC(enc, p) \ |
UCS2_GET_NAMING(nmstrtPages, (unsigned char)p[1], (unsigned char)p[0]) |
#ifdef XML_MIN_SIZE |
static int PTRFASTCALL |
little2_byteType(const ENCODING *enc, const char *p) |
{ |
return LITTLE2_BYTE_TYPE(enc, p); |
} |
static int PTRFASTCALL |
little2_byteToAscii(const ENCODING *enc, const char *p) |
{ |
return LITTLE2_BYTE_TO_ASCII(enc, p); |
} |
static int PTRCALL |
little2_charMatches(const ENCODING *enc, const char *p, int c) |
{ |
return LITTLE2_CHAR_MATCHES(enc, p, c); |
} |
static int PTRFASTCALL |
little2_isNameMin(const ENCODING *enc, const char *p) |
{ |
return LITTLE2_IS_NAME_CHAR_MINBPC(enc, p); |
} |
static int PTRFASTCALL |
little2_isNmstrtMin(const ENCODING *enc, const char *p) |
{ |
return LITTLE2_IS_NMSTRT_CHAR_MINBPC(enc, p); |
} |
#undef VTABLE |
#define VTABLE VTABLE1, little2_toUtf8, little2_toUtf16 |
#else /* not XML_MIN_SIZE */ |
#undef PREFIX |
#define PREFIX(ident) little2_ ## ident |
#define MINBPC(enc) 2 |
/* CHAR_MATCHES is guaranteed to have MINBPC bytes available. */ |
#define BYTE_TYPE(enc, p) LITTLE2_BYTE_TYPE(enc, p) |
#define BYTE_TO_ASCII(enc, p) LITTLE2_BYTE_TO_ASCII(enc, p) |
#define CHAR_MATCHES(enc, p, c) LITTLE2_CHAR_MATCHES(enc, p, c) |
#define IS_NAME_CHAR(enc, p, n) 0 |
#define IS_NAME_CHAR_MINBPC(enc, p) LITTLE2_IS_NAME_CHAR_MINBPC(enc, p) |
#define IS_NMSTRT_CHAR(enc, p, n) (0) |
#define IS_NMSTRT_CHAR_MINBPC(enc, p) LITTLE2_IS_NMSTRT_CHAR_MINBPC(enc, p) |
#define XML_TOK_IMPL_C |
#include "xmltok_impl.c" |
#undef XML_TOK_IMPL_C |
#undef MINBPC |
#undef BYTE_TYPE |
#undef BYTE_TO_ASCII |
#undef CHAR_MATCHES |
#undef IS_NAME_CHAR |
#undef IS_NAME_CHAR_MINBPC |
#undef IS_NMSTRT_CHAR |
#undef IS_NMSTRT_CHAR_MINBPC |
#undef IS_INVALID_CHAR |
#endif /* not XML_MIN_SIZE */ |
#ifdef XML_NS |
static const struct normal_encoding little2_encoding_ns = { |
{ VTABLE, 2, 0, |
#if BYTEORDER == 1234 |
1 |
#else |
0 |
#endif |
}, |
{ |
#include "asciitab.h" |
#include "latin1tab.h" |
}, |
STANDARD_VTABLE(little2_) |
}; |
#endif |
static const struct normal_encoding little2_encoding = { |
{ VTABLE, 2, 0, |
#if BYTEORDER == 1234 |
1 |
#else |
0 |
#endif |
}, |
{ |
#define BT_COLON BT_NMSTRT |
#include "asciitab.h" |
#undef BT_COLON |
#include "latin1tab.h" |
}, |
STANDARD_VTABLE(little2_) |
}; |
#if BYTEORDER != 4321 |
#ifdef XML_NS |
static const struct normal_encoding internal_little2_encoding_ns = { |
{ VTABLE, 2, 0, 1 }, |
{ |
#include "iasciitab.h" |
#include "latin1tab.h" |
}, |
STANDARD_VTABLE(little2_) |
}; |
#endif |
static const struct normal_encoding internal_little2_encoding = { |
{ VTABLE, 2, 0, 1 }, |
{ |
#define BT_COLON BT_NMSTRT |
#include "iasciitab.h" |
#undef BT_COLON |
#include "latin1tab.h" |
}, |
STANDARD_VTABLE(little2_) |
}; |
#endif |
#define BIG2_BYTE_TYPE(enc, p) \ |
((p)[0] == 0 \ |
? ((struct normal_encoding *)(enc))->type[(unsigned char)(p)[1]] \ |
: unicode_byte_type((p)[0], (p)[1])) |
#define BIG2_BYTE_TO_ASCII(enc, p) ((p)[0] == 0 ? (p)[1] : -1) |
#define BIG2_CHAR_MATCHES(enc, p, c) ((p)[0] == 0 && (p)[1] == c) |
#define BIG2_IS_NAME_CHAR_MINBPC(enc, p) \ |
UCS2_GET_NAMING(namePages, (unsigned char)p[0], (unsigned char)p[1]) |
#define BIG2_IS_NMSTRT_CHAR_MINBPC(enc, p) \ |
UCS2_GET_NAMING(nmstrtPages, (unsigned char)p[0], (unsigned char)p[1]) |
#ifdef XML_MIN_SIZE |
static int PTRFASTCALL |
big2_byteType(const ENCODING *enc, const char *p) |
{ |
return BIG2_BYTE_TYPE(enc, p); |
} |
static int PTRFASTCALL |
big2_byteToAscii(const ENCODING *enc, const char *p) |
{ |
return BIG2_BYTE_TO_ASCII(enc, p); |
} |
static int PTRCALL |
big2_charMatches(const ENCODING *enc, const char *p, int c) |
{ |
return BIG2_CHAR_MATCHES(enc, p, c); |
} |
static int PTRFASTCALL |
big2_isNameMin(const ENCODING *enc, const char *p) |
{ |
return BIG2_IS_NAME_CHAR_MINBPC(enc, p); |
} |
static int PTRFASTCALL |
big2_isNmstrtMin(const ENCODING *enc, const char *p) |
{ |
return BIG2_IS_NMSTRT_CHAR_MINBPC(enc, p); |
} |
#undef VTABLE |
#define VTABLE VTABLE1, big2_toUtf8, big2_toUtf16 |
#else /* not XML_MIN_SIZE */ |
#undef PREFIX |
#define PREFIX(ident) big2_ ## ident |
#define MINBPC(enc) 2 |
/* CHAR_MATCHES is guaranteed to have MINBPC bytes available. */ |
#define BYTE_TYPE(enc, p) BIG2_BYTE_TYPE(enc, p) |
#define BYTE_TO_ASCII(enc, p) BIG2_BYTE_TO_ASCII(enc, p) |
#define CHAR_MATCHES(enc, p, c) BIG2_CHAR_MATCHES(enc, p, c) |
#define IS_NAME_CHAR(enc, p, n) 0 |
#define IS_NAME_CHAR_MINBPC(enc, p) BIG2_IS_NAME_CHAR_MINBPC(enc, p) |
#define IS_NMSTRT_CHAR(enc, p, n) (0) |
#define IS_NMSTRT_CHAR_MINBPC(enc, p) BIG2_IS_NMSTRT_CHAR_MINBPC(enc, p) |
#define XML_TOK_IMPL_C |
#include "xmltok_impl.c" |
#undef XML_TOK_IMPL_C |
#undef MINBPC |
#undef BYTE_TYPE |
#undef BYTE_TO_ASCII |
#undef CHAR_MATCHES |
#undef IS_NAME_CHAR |
#undef IS_NAME_CHAR_MINBPC |
#undef IS_NMSTRT_CHAR |
#undef IS_NMSTRT_CHAR_MINBPC |
#undef IS_INVALID_CHAR |
#endif /* not XML_MIN_SIZE */ |
#ifdef XML_NS |
static const struct normal_encoding big2_encoding_ns = { |
{ VTABLE, 2, 0, |
#if BYTEORDER == 4321 |
1 |
#else |
0 |
#endif |
}, |
{ |
#include "asciitab.h" |
#include "latin1tab.h" |
}, |
STANDARD_VTABLE(big2_) |
}; |
#endif |
static const struct normal_encoding big2_encoding = { |
{ VTABLE, 2, 0, |
#if BYTEORDER == 4321 |
1 |
#else |
0 |
#endif |
}, |
{ |
#define BT_COLON BT_NMSTRT |
#include "asciitab.h" |
#undef BT_COLON |
#include "latin1tab.h" |
}, |
STANDARD_VTABLE(big2_) |
}; |
#if BYTEORDER != 1234 |
#ifdef XML_NS |
static const struct normal_encoding internal_big2_encoding_ns = { |
{ VTABLE, 2, 0, 1 }, |
{ |
#include "iasciitab.h" |
#include "latin1tab.h" |
}, |
STANDARD_VTABLE(big2_) |
}; |
#endif |
static const struct normal_encoding internal_big2_encoding = { |
{ VTABLE, 2, 0, 1 }, |
{ |
#define BT_COLON BT_NMSTRT |
#include "iasciitab.h" |
#undef BT_COLON |
#include "latin1tab.h" |
}, |
STANDARD_VTABLE(big2_) |
}; |
#endif |
#undef PREFIX |
static int FASTCALL |
streqci(const char *s1, const char *s2) |
{ |
for (;;) { |
char c1 = *s1++; |
char c2 = *s2++; |
if (ASCII_a <= c1 && c1 <= ASCII_z) |
c1 += ASCII_A - ASCII_a; |
if (ASCII_a <= c2 && c2 <= ASCII_z) |
c2 += ASCII_A - ASCII_a; |
if (c1 != c2) |
return 0; |
if (!c1) |
break; |
} |
return 1; |
} |
static void PTRCALL |
initUpdatePosition(const ENCODING *enc, const char *ptr, |
const char *end, POSITION *pos) |
{ |
normal_updatePosition(&utf8_encoding.enc, ptr, end, pos); |
} |
static int |
toAscii(const ENCODING *enc, const char *ptr, const char *end) |
{ |
char buf[1]; |
char *p = buf; |
XmlUtf8Convert(enc, &ptr, end, &p, p + 1); |
if (p == buf) |
return -1; |
else |
return buf[0]; |
} |
static int FASTCALL |
isSpace(int c) |
{ |
switch (c) { |
case 0x20: |
case 0xD: |
case 0xA: |
case 0x9: |
return 1; |
} |
return 0; |
} |
/* Return 1 if there's just optional white space or there's an S |
followed by name=val. |
*/ |
static int |
parsePseudoAttribute(const ENCODING *enc, |
const char *ptr, |
const char *end, |
const char **namePtr, |
const char **nameEndPtr, |
const char **valPtr, |
const char **nextTokPtr) |
{ |
int c; |
char open; |
if (ptr == end) { |
*namePtr = NULL; |
return 1; |
} |
if (!isSpace(toAscii(enc, ptr, end))) { |
*nextTokPtr = ptr; |
return 0; |
} |
do { |
ptr += enc->minBytesPerChar; |
} while (isSpace(toAscii(enc, ptr, end))); |
if (ptr == end) { |
*namePtr = NULL; |
return 1; |
} |
*namePtr = ptr; |
for (;;) { |
c = toAscii(enc, ptr, end); |
if (c == -1) { |
*nextTokPtr = ptr; |
return 0; |
} |
if (c == ASCII_EQUALS) { |
*nameEndPtr = ptr; |
break; |
} |
if (isSpace(c)) { |
*nameEndPtr = ptr; |
do { |
ptr += enc->minBytesPerChar; |
} while (isSpace(c = toAscii(enc, ptr, end))); |
if (c != ASCII_EQUALS) { |
*nextTokPtr = ptr; |
return 0; |
} |
break; |
} |
ptr += enc->minBytesPerChar; |
} |
if (ptr == *namePtr) { |
*nextTokPtr = ptr; |
return 0; |
} |
ptr += enc->minBytesPerChar; |
c = toAscii(enc, ptr, end); |
while (isSpace(c)) { |
ptr += enc->minBytesPerChar; |
c = toAscii(enc, ptr, end); |
} |
if (c != ASCII_QUOT && c != ASCII_APOS) { |
*nextTokPtr = ptr; |
return 0; |
} |
open = (char)c; |
ptr += enc->minBytesPerChar; |
*valPtr = ptr; |
for (;; ptr += enc->minBytesPerChar) { |
c = toAscii(enc, ptr, end); |
if (c == open) |
break; |
if (!(ASCII_a <= c && c <= ASCII_z) |
&& !(ASCII_A <= c && c <= ASCII_Z) |
&& !(ASCII_0 <= c && c <= ASCII_9) |
&& c != ASCII_PERIOD |
&& c != ASCII_MINUS |
&& c != ASCII_UNDERSCORE) { |
*nextTokPtr = ptr; |
return 0; |
} |
} |
*nextTokPtr = ptr + enc->minBytesPerChar; |
return 1; |
} |
static const char KW_version[] = { |
ASCII_v, ASCII_e, ASCII_r, ASCII_s, ASCII_i, ASCII_o, ASCII_n, '\0' |
}; |
static const char KW_encoding[] = { |
ASCII_e, ASCII_n, ASCII_c, ASCII_o, ASCII_d, ASCII_i, ASCII_n, ASCII_g, '\0' |
}; |
static const char KW_standalone[] = { |
ASCII_s, ASCII_t, ASCII_a, ASCII_n, ASCII_d, ASCII_a, ASCII_l, ASCII_o, |
ASCII_n, ASCII_e, '\0' |
}; |
static const char KW_yes[] = { |
ASCII_y, ASCII_e, ASCII_s, '\0' |
}; |
static const char KW_no[] = { |
ASCII_n, ASCII_o, '\0' |
}; |
static int |
doParseXmlDecl(const ENCODING *(*encodingFinder)(const ENCODING *, |
const char *, |
const char *), |
int isGeneralTextEntity, |
const ENCODING *enc, |
const char *ptr, |
const char *end, |
const char **badPtr, |
const char **versionPtr, |
const char **versionEndPtr, |
const char **encodingName, |
const ENCODING **encoding, |
int *standalone) |
{ |
const char *val = NULL; |
const char *name = NULL; |
const char *nameEnd = NULL; |
ptr += 5 * enc->minBytesPerChar; |
end -= 2 * enc->minBytesPerChar; |
if (!parsePseudoAttribute(enc, ptr, end, &name, &nameEnd, &val, &ptr) |
|| !name) { |
*badPtr = ptr; |
return 0; |
} |
if (!XmlNameMatchesAscii(enc, name, nameEnd, KW_version)) { |
if (!isGeneralTextEntity) { |
*badPtr = name; |
return 0; |
} |
} |
else { |
if (versionPtr) |
*versionPtr = val; |
if (versionEndPtr) |
*versionEndPtr = ptr; |
if (!parsePseudoAttribute(enc, ptr, end, &name, &nameEnd, &val, &ptr)) { |
*badPtr = ptr; |
return 0; |
} |
if (!name) { |
if (isGeneralTextEntity) { |
/* a TextDecl must have an EncodingDecl */ |
*badPtr = ptr; |
return 0; |
} |
return 1; |
} |
} |
if (XmlNameMatchesAscii(enc, name, nameEnd, KW_encoding)) { |
int c = toAscii(enc, val, end); |
if (!(ASCII_a <= c && c <= ASCII_z) && !(ASCII_A <= c && c <= ASCII_Z)) { |
*badPtr = val; |
return 0; |
} |
if (encodingName) |
*encodingName = val; |
if (encoding) |
*encoding = encodingFinder(enc, val, ptr - enc->minBytesPerChar); |
if (!parsePseudoAttribute(enc, ptr, end, &name, &nameEnd, &val, &ptr)) { |
*badPtr = ptr; |
return 0; |
} |
if (!name) |
return 1; |
} |
if (!XmlNameMatchesAscii(enc, name, nameEnd, KW_standalone) |
|| isGeneralTextEntity) { |
*badPtr = name; |
return 0; |
} |
if (XmlNameMatchesAscii(enc, val, ptr - enc->minBytesPerChar, KW_yes)) { |
if (standalone) |
*standalone = 1; |
} |
else if (XmlNameMatchesAscii(enc, val, ptr - enc->minBytesPerChar, KW_no)) { |
if (standalone) |
*standalone = 0; |
} |
else { |
*badPtr = val; |
return 0; |
} |
while (isSpace(toAscii(enc, ptr, end))) |
ptr += enc->minBytesPerChar; |
if (ptr != end) { |
*badPtr = ptr; |
return 0; |
} |
return 1; |
} |
static int FASTCALL |
checkCharRefNumber(int result) |
{ |
switch (result >> 8) { |
case 0xD8: case 0xD9: case 0xDA: case 0xDB: |
case 0xDC: case 0xDD: case 0xDE: case 0xDF: |
return -1; |
case 0: |
if (latin1_encoding.type[result] == BT_NONXML) |
return -1; |
break; |
case 0xFF: |
if (result == 0xFFFE || result == 0xFFFF) |
return -1; |
break; |
} |
return result; |
} |
int FASTCALL |
XmlUtf8Encode(int c, char *buf) |
{ |
enum { |
/* minN is minimum legal resulting value for N byte sequence */ |
min2 = 0x80, |
min3 = 0x800, |
min4 = 0x10000 |
}; |
if (c < 0) |
return 0; |
if (c < min2) { |
buf[0] = (char)(c | UTF8_cval1); |
return 1; |
} |
if (c < min3) { |
buf[0] = (char)((c >> 6) | UTF8_cval2); |
buf[1] = (char)((c & 0x3f) | 0x80); |
return 2; |
} |
if (c < min4) { |
buf[0] = (char)((c >> 12) | UTF8_cval3); |
buf[1] = (char)(((c >> 6) & 0x3f) | 0x80); |
buf[2] = (char)((c & 0x3f) | 0x80); |
return 3; |
} |
if (c < 0x110000) { |
buf[0] = (char)((c >> 18) | UTF8_cval4); |
buf[1] = (char)(((c >> 12) & 0x3f) | 0x80); |
buf[2] = (char)(((c >> 6) & 0x3f) | 0x80); |
buf[3] = (char)((c & 0x3f) | 0x80); |
return 4; |
} |
return 0; |
} |
int FASTCALL |
XmlUtf16Encode(int charNum, unsigned short *buf) |
{ |
if (charNum < 0) |
return 0; |
if (charNum < 0x10000) { |
buf[0] = (unsigned short)charNum; |
return 1; |
} |
if (charNum < 0x110000) { |
charNum -= 0x10000; |
buf[0] = (unsigned short)((charNum >> 10) + 0xD800); |
buf[1] = (unsigned short)((charNum & 0x3FF) + 0xDC00); |
return 2; |
} |
return 0; |
} |
struct unknown_encoding { |
struct normal_encoding normal; |
CONVERTER convert; |
void *userData; |
unsigned short utf16[256]; |
char utf8[256][4]; |
}; |
#define AS_UNKNOWN_ENCODING(enc) ((const struct unknown_encoding *) (enc)) |
int |
XmlSizeOfUnknownEncoding(void) |
{ |
return sizeof(struct unknown_encoding); |
} |
static int PTRFASTCALL |
unknown_isName(const ENCODING *enc, const char *p) |
{ |
const struct unknown_encoding *uenc = AS_UNKNOWN_ENCODING(enc); |
int c = uenc->convert(uenc->userData, p); |
if (c & ~0xFFFF) |
return 0; |
return UCS2_GET_NAMING(namePages, c >> 8, c & 0xFF); |
} |
static int PTRFASTCALL |
unknown_isNmstrt(const ENCODING *enc, const char *p) |
{ |
const struct unknown_encoding *uenc = AS_UNKNOWN_ENCODING(enc); |
int c = uenc->convert(uenc->userData, p); |
if (c & ~0xFFFF) |
return 0; |
return UCS2_GET_NAMING(nmstrtPages, c >> 8, c & 0xFF); |
} |
static int PTRFASTCALL |
unknown_isInvalid(const ENCODING *enc, const char *p) |
{ |
const struct unknown_encoding *uenc = AS_UNKNOWN_ENCODING(enc); |
int c = uenc->convert(uenc->userData, p); |
return (c & ~0xFFFF) || checkCharRefNumber(c) < 0; |
} |
static void PTRCALL |
unknown_toUtf8(const ENCODING *enc, |
const char **fromP, const char *fromLim, |
char **toP, const char *toLim) |
{ |
const struct unknown_encoding *uenc = AS_UNKNOWN_ENCODING(enc); |
char buf[XML_UTF8_ENCODE_MAX]; |
for (;;) { |
const char *utf8; |
int n; |
if (*fromP == fromLim) |
break; |
utf8 = uenc->utf8[(unsigned char)**fromP]; |
n = *utf8++; |
if (n == 0) { |
int c = uenc->convert(uenc->userData, *fromP); |
n = XmlUtf8Encode(c, buf); |
if (n > toLim - *toP) |
break; |
utf8 = buf; |
*fromP += (AS_NORMAL_ENCODING(enc)->type[(unsigned char)**fromP] |
- (BT_LEAD2 - 2)); |
} |
else { |
if (n > toLim - *toP) |
break; |
(*fromP)++; |
} |
do { |
*(*toP)++ = *utf8++; |
} while (--n != 0); |
} |
} |
static void PTRCALL |
unknown_toUtf16(const ENCODING *enc, |
const char **fromP, const char *fromLim, |
unsigned short **toP, const unsigned short *toLim) |
{ |
const struct unknown_encoding *uenc = AS_UNKNOWN_ENCODING(enc); |
while (*fromP != fromLim && *toP != toLim) { |
unsigned short c = uenc->utf16[(unsigned char)**fromP]; |
if (c == 0) { |
c = (unsigned short) |
uenc->convert(uenc->userData, *fromP); |
*fromP += (AS_NORMAL_ENCODING(enc)->type[(unsigned char)**fromP] |
- (BT_LEAD2 - 2)); |
} |
else |
(*fromP)++; |
*(*toP)++ = c; |
} |
} |
ENCODING * |
XmlInitUnknownEncoding(void *mem, |
int *table, |
CONVERTER convert, |
void *userData) |
{ |
int i; |
struct unknown_encoding *e = (struct unknown_encoding *)mem; |
for (i = 0; i < (int)sizeof(struct normal_encoding); i++) |
((char *)mem)[i] = ((char *)&latin1_encoding)[i]; |
for (i = 0; i < 128; i++) |
if (latin1_encoding.type[i] != BT_OTHER |
&& latin1_encoding.type[i] != BT_NONXML |
&& table[i] != i) |
return 0; |
for (i = 0; i < 256; i++) { |
int c = table[i]; |
if (c == -1) { |
e->normal.type[i] = BT_MALFORM; |
/* This shouldn't really get used. */ |
e->utf16[i] = 0xFFFF; |
e->utf8[i][0] = 1; |
e->utf8[i][1] = 0; |
} |
else if (c < 0) { |
if (c < -4) |
return 0; |
e->normal.type[i] = (unsigned char)(BT_LEAD2 - (c + 2)); |
e->utf8[i][0] = 0; |
e->utf16[i] = 0; |
} |
else if (c < 0x80) { |
if (latin1_encoding.type[c] != BT_OTHER |
&& latin1_encoding.type[c] != BT_NONXML |
&& c != i) |
return 0; |
e->normal.type[i] = latin1_encoding.type[c]; |
e->utf8[i][0] = 1; |
e->utf8[i][1] = (char)c; |
e->utf16[i] = (unsigned short)(c == 0 ? 0xFFFF : c); |
} |
else if (checkCharRefNumber(c) < 0) { |
e->normal.type[i] = BT_NONXML; |
/* This shouldn't really get used. */ |
e->utf16[i] = 0xFFFF; |
e->utf8[i][0] = 1; |
e->utf8[i][1] = 0; |
} |
else { |
if (c > 0xFFFF) |
return 0; |
if (UCS2_GET_NAMING(nmstrtPages, c >> 8, c & 0xff)) |
e->normal.type[i] = BT_NMSTRT; |
else if (UCS2_GET_NAMING(namePages, c >> 8, c & 0xff)) |
e->normal.type[i] = BT_NAME; |
else |
e->normal.type[i] = BT_OTHER; |
e->utf8[i][0] = (char)XmlUtf8Encode(c, e->utf8[i] + 1); |
e->utf16[i] = (unsigned short)c; |
} |
} |
e->userData = userData; |
e->convert = convert; |
if (convert) { |
e->normal.isName2 = unknown_isName; |
e->normal.isName3 = unknown_isName; |
e->normal.isName4 = unknown_isName; |
e->normal.isNmstrt2 = unknown_isNmstrt; |
e->normal.isNmstrt3 = unknown_isNmstrt; |
e->normal.isNmstrt4 = unknown_isNmstrt; |
e->normal.isInvalid2 = unknown_isInvalid; |
e->normal.isInvalid3 = unknown_isInvalid; |
e->normal.isInvalid4 = unknown_isInvalid; |
} |
e->normal.enc.utf8Convert = unknown_toUtf8; |
e->normal.enc.utf16Convert = unknown_toUtf16; |
return &(e->normal.enc); |
} |
/* If this enumeration is changed, getEncodingIndex and encodings |
must also be changed. */ |
enum { |
UNKNOWN_ENC = -1, |
ISO_8859_1_ENC = 0, |
US_ASCII_ENC, |
UTF_8_ENC, |
UTF_16_ENC, |
UTF_16BE_ENC, |
UTF_16LE_ENC, |
/* must match encodingNames up to here */ |
NO_ENC |
}; |
static const char KW_ISO_8859_1[] = { |
ASCII_I, ASCII_S, ASCII_O, ASCII_MINUS, ASCII_8, ASCII_8, ASCII_5, ASCII_9, |
ASCII_MINUS, ASCII_1, '\0' |
}; |
static const char KW_US_ASCII[] = { |
ASCII_U, ASCII_S, ASCII_MINUS, ASCII_A, ASCII_S, ASCII_C, ASCII_I, ASCII_I, |
'\0' |
}; |
static const char KW_UTF_8[] = { |
ASCII_U, ASCII_T, ASCII_F, ASCII_MINUS, ASCII_8, '\0' |
}; |
static const char KW_UTF_16[] = { |
ASCII_U, ASCII_T, ASCII_F, ASCII_MINUS, ASCII_1, ASCII_6, '\0' |
}; |
static const char KW_UTF_16BE[] = { |
ASCII_U, ASCII_T, ASCII_F, ASCII_MINUS, ASCII_1, ASCII_6, ASCII_B, ASCII_E, |
'\0' |
}; |
static const char KW_UTF_16LE[] = { |
ASCII_U, ASCII_T, ASCII_F, ASCII_MINUS, ASCII_1, ASCII_6, ASCII_L, ASCII_E, |
'\0' |
}; |
static int FASTCALL |
getEncodingIndex(const char *name) |
{ |
static const char * const encodingNames[] = { |
KW_ISO_8859_1, |
KW_US_ASCII, |
KW_UTF_8, |
KW_UTF_16, |
KW_UTF_16BE, |
KW_UTF_16LE, |
}; |
int i; |
if (name == NULL) |
return NO_ENC; |
for (i = 0; i < (int)(sizeof(encodingNames)/sizeof(encodingNames[0])); i++) |
if (streqci(name, encodingNames[i])) |
return i; |
return UNKNOWN_ENC; |
} |
/* For binary compatibility, we store the index of the encoding |
specified at initialization in the isUtf16 member. |
*/ |
#define INIT_ENC_INDEX(enc) ((int)(enc)->initEnc.isUtf16) |
#define SET_INIT_ENC_INDEX(enc, i) ((enc)->initEnc.isUtf16 = (char)i) |
/* This is what detects the encoding. encodingTable maps from |
encoding indices to encodings; INIT_ENC_INDEX(enc) is the index of |
the external (protocol) specified encoding; state is |
XML_CONTENT_STATE if we're parsing an external text entity, and |
XML_PROLOG_STATE otherwise. |
*/ |
static int |
initScan(const ENCODING * const *encodingTable, |
const INIT_ENCODING *enc, |
int state, |
const char *ptr, |
const char *end, |
const char **nextTokPtr) |
{ |
const ENCODING **encPtr; |
if (ptr == end) |
return XML_TOK_NONE; |
encPtr = enc->encPtr; |
if (ptr + 1 == end) { |
/* only a single byte available for auto-detection */ |
#ifndef XML_DTD /* FIXME */ |
/* a well-formed document entity must have more than one byte */ |
if (state != XML_CONTENT_STATE) |
return XML_TOK_PARTIAL; |
#endif |
/* so we're parsing an external text entity... */ |
/* if UTF-16 was externally specified, then we need at least 2 bytes */ |
switch (INIT_ENC_INDEX(enc)) { |
case UTF_16_ENC: |
case UTF_16LE_ENC: |
case UTF_16BE_ENC: |
return XML_TOK_PARTIAL; |
} |
switch ((unsigned char)*ptr) { |
case 0xFE: |
case 0xFF: |
case 0xEF: /* possibly first byte of UTF-8 BOM */ |
if (INIT_ENC_INDEX(enc) == ISO_8859_1_ENC |
&& state == XML_CONTENT_STATE) |
break; |
/* fall through */ |
case 0x00: |
case 0x3C: |
return XML_TOK_PARTIAL; |
} |
} |
else { |
switch (((unsigned char)ptr[0] << 8) | (unsigned char)ptr[1]) { |
case 0xFEFF: |
if (INIT_ENC_INDEX(enc) == ISO_8859_1_ENC |
&& state == XML_CONTENT_STATE) |
break; |
*nextTokPtr = ptr + 2; |
*encPtr = encodingTable[UTF_16BE_ENC]; |
return XML_TOK_BOM; |
/* 00 3C is handled in the default case */ |
case 0x3C00: |
if ((INIT_ENC_INDEX(enc) == UTF_16BE_ENC |
|| INIT_ENC_INDEX(enc) == UTF_16_ENC) |
&& state == XML_CONTENT_STATE) |
break; |
*encPtr = encodingTable[UTF_16LE_ENC]; |
return XmlTok(*encPtr, state, ptr, end, nextTokPtr); |
case 0xFFFE: |
if (INIT_ENC_INDEX(enc) == ISO_8859_1_ENC |
&& state == XML_CONTENT_STATE) |
break; |
*nextTokPtr = ptr + 2; |
*encPtr = encodingTable[UTF_16LE_ENC]; |
return XML_TOK_BOM; |
case 0xEFBB: |
/* Maybe a UTF-8 BOM (EF BB BF) */ |
/* If there's an explicitly specified (external) encoding |
of ISO-8859-1 or some flavour of UTF-16 |
and this is an external text entity, |
don't look for the BOM, |
because it might be a legal data. |
*/ |
if (state == XML_CONTENT_STATE) { |
int e = INIT_ENC_INDEX(enc); |
if (e == ISO_8859_1_ENC || e == UTF_16BE_ENC |
|| e == UTF_16LE_ENC || e == UTF_16_ENC) |
break; |
} |
if (ptr + 2 == end) |
return XML_TOK_PARTIAL; |
if ((unsigned char)ptr[2] == 0xBF) { |
*nextTokPtr = ptr + 3; |
*encPtr = encodingTable[UTF_8_ENC]; |
return XML_TOK_BOM; |
} |
break; |
default: |
if (ptr[0] == '\0') { |
/* 0 isn't a legal data character. Furthermore a document |
entity can only start with ASCII characters. So the only |
way this can fail to be big-endian UTF-16 if it it's an |
external parsed general entity that's labelled as |
UTF-16LE. |
*/ |
if (state == XML_CONTENT_STATE && INIT_ENC_INDEX(enc) == UTF_16LE_ENC) |
break; |
*encPtr = encodingTable[UTF_16BE_ENC]; |
return XmlTok(*encPtr, state, ptr, end, nextTokPtr); |
} |
else if (ptr[1] == '\0') { |
/* We could recover here in the case: |
- parsing an external entity |
- second byte is 0 |
- no externally specified encoding |
- no encoding declaration |
by assuming UTF-16LE. But we don't, because this would mean when |
presented just with a single byte, we couldn't reliably determine |
whether we needed further bytes. |
*/ |
if (state == XML_CONTENT_STATE) |
break; |
*encPtr = encodingTable[UTF_16LE_ENC]; |
return XmlTok(*encPtr, state, ptr, end, nextTokPtr); |
} |
break; |
} |
} |
*encPtr = encodingTable[INIT_ENC_INDEX(enc)]; |
return XmlTok(*encPtr, state, ptr, end, nextTokPtr); |
} |
#define NS(x) x |
#define ns(x) x |
#define XML_TOK_NS_C |
#include "xmltok_ns.c" |
#undef XML_TOK_NS_C |
#undef NS |
#undef ns |
#ifdef XML_NS |
#define NS(x) x ## NS |
#define ns(x) x ## _ns |
#define XML_TOK_NS_C |
#include "xmltok_ns.c" |
#undef XML_TOK_NS_C |
#undef NS |
#undef ns |
ENCODING * |
XmlInitUnknownEncodingNS(void *mem, |
int *table, |
CONVERTER convert, |
void *userData) |
{ |
ENCODING *enc = XmlInitUnknownEncoding(mem, table, convert, userData); |
if (enc) |
((struct normal_encoding *)enc)->type[ASCII_COLON] = BT_COLON; |
return enc; |
} |
#endif /* XML_NS */ |
/contrib/sdk/sources/expat/lib/xmltok.h |
---|
0,0 → 1,316 |
/* Copyright (c) 1998, 1999 Thai Open Source Software Center Ltd |
See the file COPYING for copying permission. |
*/ |
#ifndef XmlTok_INCLUDED |
#define XmlTok_INCLUDED 1 |
#ifdef __cplusplus |
extern "C" { |
#endif |
/* The following token may be returned by XmlContentTok */ |
#define XML_TOK_TRAILING_RSQB -5 /* ] or ]] at the end of the scan; might be |
start of illegal ]]> sequence */ |
/* The following tokens may be returned by both XmlPrologTok and |
XmlContentTok. |
*/ |
#define XML_TOK_NONE -4 /* The string to be scanned is empty */ |
#define XML_TOK_TRAILING_CR -3 /* A CR at the end of the scan; |
might be part of CRLF sequence */ |
#define XML_TOK_PARTIAL_CHAR -2 /* only part of a multibyte sequence */ |
#define XML_TOK_PARTIAL -1 /* only part of a token */ |
#define XML_TOK_INVALID 0 |
/* The following tokens are returned by XmlContentTok; some are also |
returned by XmlAttributeValueTok, XmlEntityTok, XmlCdataSectionTok. |
*/ |
#define XML_TOK_START_TAG_WITH_ATTS 1 |
#define XML_TOK_START_TAG_NO_ATTS 2 |
#define XML_TOK_EMPTY_ELEMENT_WITH_ATTS 3 /* empty element tag <e/> */ |
#define XML_TOK_EMPTY_ELEMENT_NO_ATTS 4 |
#define XML_TOK_END_TAG 5 |
#define XML_TOK_DATA_CHARS 6 |
#define XML_TOK_DATA_NEWLINE 7 |
#define XML_TOK_CDATA_SECT_OPEN 8 |
#define XML_TOK_ENTITY_REF 9 |
#define XML_TOK_CHAR_REF 10 /* numeric character reference */ |
/* The following tokens may be returned by both XmlPrologTok and |
XmlContentTok. |
*/ |
#define XML_TOK_PI 11 /* processing instruction */ |
#define XML_TOK_XML_DECL 12 /* XML decl or text decl */ |
#define XML_TOK_COMMENT 13 |
#define XML_TOK_BOM 14 /* Byte order mark */ |
/* The following tokens are returned only by XmlPrologTok */ |
#define XML_TOK_PROLOG_S 15 |
#define XML_TOK_DECL_OPEN 16 /* <!foo */ |
#define XML_TOK_DECL_CLOSE 17 /* > */ |
#define XML_TOK_NAME 18 |
#define XML_TOK_NMTOKEN 19 |
#define XML_TOK_POUND_NAME 20 /* #name */ |
#define XML_TOK_OR 21 /* | */ |
#define XML_TOK_PERCENT 22 |
#define XML_TOK_OPEN_PAREN 23 |
#define XML_TOK_CLOSE_PAREN 24 |
#define XML_TOK_OPEN_BRACKET 25 |
#define XML_TOK_CLOSE_BRACKET 26 |
#define XML_TOK_LITERAL 27 |
#define XML_TOK_PARAM_ENTITY_REF 28 |
#define XML_TOK_INSTANCE_START 29 |
/* The following occur only in element type declarations */ |
#define XML_TOK_NAME_QUESTION 30 /* name? */ |
#define XML_TOK_NAME_ASTERISK 31 /* name* */ |
#define XML_TOK_NAME_PLUS 32 /* name+ */ |
#define XML_TOK_COND_SECT_OPEN 33 /* <![ */ |
#define XML_TOK_COND_SECT_CLOSE 34 /* ]]> */ |
#define XML_TOK_CLOSE_PAREN_QUESTION 35 /* )? */ |
#define XML_TOK_CLOSE_PAREN_ASTERISK 36 /* )* */ |
#define XML_TOK_CLOSE_PAREN_PLUS 37 /* )+ */ |
#define XML_TOK_COMMA 38 |
/* The following token is returned only by XmlAttributeValueTok */ |
#define XML_TOK_ATTRIBUTE_VALUE_S 39 |
/* The following token is returned only by XmlCdataSectionTok */ |
#define XML_TOK_CDATA_SECT_CLOSE 40 |
/* With namespace processing this is returned by XmlPrologTok for a |
name with a colon. |
*/ |
#define XML_TOK_PREFIXED_NAME 41 |
#ifdef XML_DTD |
#define XML_TOK_IGNORE_SECT 42 |
#endif /* XML_DTD */ |
#ifdef XML_DTD |
#define XML_N_STATES 4 |
#else /* not XML_DTD */ |
#define XML_N_STATES 3 |
#endif /* not XML_DTD */ |
#define XML_PROLOG_STATE 0 |
#define XML_CONTENT_STATE 1 |
#define XML_CDATA_SECTION_STATE 2 |
#ifdef XML_DTD |
#define XML_IGNORE_SECTION_STATE 3 |
#endif /* XML_DTD */ |
#define XML_N_LITERAL_TYPES 2 |
#define XML_ATTRIBUTE_VALUE_LITERAL 0 |
#define XML_ENTITY_VALUE_LITERAL 1 |
/* The size of the buffer passed to XmlUtf8Encode must be at least this. */ |
#define XML_UTF8_ENCODE_MAX 4 |
/* The size of the buffer passed to XmlUtf16Encode must be at least this. */ |
#define XML_UTF16_ENCODE_MAX 2 |
typedef struct position { |
/* first line and first column are 0 not 1 */ |
XML_Size lineNumber; |
XML_Size columnNumber; |
} POSITION; |
typedef struct { |
const char *name; |
const char *valuePtr; |
const char *valueEnd; |
char normalized; |
} ATTRIBUTE; |
struct encoding; |
typedef struct encoding ENCODING; |
typedef int (PTRCALL *SCANNER)(const ENCODING *, |
const char *, |
const char *, |
const char **); |
struct encoding { |
SCANNER scanners[XML_N_STATES]; |
SCANNER literalScanners[XML_N_LITERAL_TYPES]; |
int (PTRCALL *sameName)(const ENCODING *, |
const char *, |
const char *); |
int (PTRCALL *nameMatchesAscii)(const ENCODING *, |
const char *, |
const char *, |
const char *); |
int (PTRFASTCALL *nameLength)(const ENCODING *, const char *); |
const char *(PTRFASTCALL *skipS)(const ENCODING *, const char *); |
int (PTRCALL *getAtts)(const ENCODING *enc, |
const char *ptr, |
int attsMax, |
ATTRIBUTE *atts); |
int (PTRFASTCALL *charRefNumber)(const ENCODING *enc, const char *ptr); |
int (PTRCALL *predefinedEntityName)(const ENCODING *, |
const char *, |
const char *); |
void (PTRCALL *updatePosition)(const ENCODING *, |
const char *ptr, |
const char *end, |
POSITION *); |
int (PTRCALL *isPublicId)(const ENCODING *enc, |
const char *ptr, |
const char *end, |
const char **badPtr); |
void (PTRCALL *utf8Convert)(const ENCODING *enc, |
const char **fromP, |
const char *fromLim, |
char **toP, |
const char *toLim); |
void (PTRCALL *utf16Convert)(const ENCODING *enc, |
const char **fromP, |
const char *fromLim, |
unsigned short **toP, |
const unsigned short *toLim); |
int minBytesPerChar; |
char isUtf8; |
char isUtf16; |
}; |
/* Scan the string starting at ptr until the end of the next complete |
token, but do not scan past eptr. Return an integer giving the |
type of token. |
Return XML_TOK_NONE when ptr == eptr; nextTokPtr will not be set. |
Return XML_TOK_PARTIAL when the string does not contain a complete |
token; nextTokPtr will not be set. |
Return XML_TOK_INVALID when the string does not start a valid |
token; nextTokPtr will be set to point to the character which made |
the token invalid. |
Otherwise the string starts with a valid token; nextTokPtr will be |
set to point to the character following the end of that token. |
Each data character counts as a single token, but adjacent data |
characters may be returned together. Similarly for characters in |
the prolog outside literals, comments and processing instructions. |
*/ |
#define XmlTok(enc, state, ptr, end, nextTokPtr) \ |
(((enc)->scanners[state])(enc, ptr, end, nextTokPtr)) |
#define XmlPrologTok(enc, ptr, end, nextTokPtr) \ |
XmlTok(enc, XML_PROLOG_STATE, ptr, end, nextTokPtr) |
#define XmlContentTok(enc, ptr, end, nextTokPtr) \ |
XmlTok(enc, XML_CONTENT_STATE, ptr, end, nextTokPtr) |
#define XmlCdataSectionTok(enc, ptr, end, nextTokPtr) \ |
XmlTok(enc, XML_CDATA_SECTION_STATE, ptr, end, nextTokPtr) |
#ifdef XML_DTD |
#define XmlIgnoreSectionTok(enc, ptr, end, nextTokPtr) \ |
XmlTok(enc, XML_IGNORE_SECTION_STATE, ptr, end, nextTokPtr) |
#endif /* XML_DTD */ |
/* This is used for performing a 2nd-level tokenization on the content |
of a literal that has already been returned by XmlTok. |
*/ |
#define XmlLiteralTok(enc, literalType, ptr, end, nextTokPtr) \ |
(((enc)->literalScanners[literalType])(enc, ptr, end, nextTokPtr)) |
#define XmlAttributeValueTok(enc, ptr, end, nextTokPtr) \ |
XmlLiteralTok(enc, XML_ATTRIBUTE_VALUE_LITERAL, ptr, end, nextTokPtr) |
#define XmlEntityValueTok(enc, ptr, end, nextTokPtr) \ |
XmlLiteralTok(enc, XML_ENTITY_VALUE_LITERAL, ptr, end, nextTokPtr) |
#define XmlSameName(enc, ptr1, ptr2) (((enc)->sameName)(enc, ptr1, ptr2)) |
#define XmlNameMatchesAscii(enc, ptr1, end1, ptr2) \ |
(((enc)->nameMatchesAscii)(enc, ptr1, end1, ptr2)) |
#define XmlNameLength(enc, ptr) \ |
(((enc)->nameLength)(enc, ptr)) |
#define XmlSkipS(enc, ptr) \ |
(((enc)->skipS)(enc, ptr)) |
#define XmlGetAttributes(enc, ptr, attsMax, atts) \ |
(((enc)->getAtts)(enc, ptr, attsMax, atts)) |
#define XmlCharRefNumber(enc, ptr) \ |
(((enc)->charRefNumber)(enc, ptr)) |
#define XmlPredefinedEntityName(enc, ptr, end) \ |
(((enc)->predefinedEntityName)(enc, ptr, end)) |
#define XmlUpdatePosition(enc, ptr, end, pos) \ |
(((enc)->updatePosition)(enc, ptr, end, pos)) |
#define XmlIsPublicId(enc, ptr, end, badPtr) \ |
(((enc)->isPublicId)(enc, ptr, end, badPtr)) |
#define XmlUtf8Convert(enc, fromP, fromLim, toP, toLim) \ |
(((enc)->utf8Convert)(enc, fromP, fromLim, toP, toLim)) |
#define XmlUtf16Convert(enc, fromP, fromLim, toP, toLim) \ |
(((enc)->utf16Convert)(enc, fromP, fromLim, toP, toLim)) |
typedef struct { |
ENCODING initEnc; |
const ENCODING **encPtr; |
} INIT_ENCODING; |
int XmlParseXmlDecl(int isGeneralTextEntity, |
const ENCODING *enc, |
const char *ptr, |
const char *end, |
const char **badPtr, |
const char **versionPtr, |
const char **versionEndPtr, |
const char **encodingNamePtr, |
const ENCODING **namedEncodingPtr, |
int *standalonePtr); |
int XmlInitEncoding(INIT_ENCODING *, const ENCODING **, const char *name); |
const ENCODING *XmlGetUtf8InternalEncoding(void); |
const ENCODING *XmlGetUtf16InternalEncoding(void); |
int FASTCALL XmlUtf8Encode(int charNumber, char *buf); |
int FASTCALL XmlUtf16Encode(int charNumber, unsigned short *buf); |
int XmlSizeOfUnknownEncoding(void); |
typedef int (XMLCALL *CONVERTER) (void *userData, const char *p); |
ENCODING * |
XmlInitUnknownEncoding(void *mem, |
int *table, |
CONVERTER convert, |
void *userData); |
int XmlParseXmlDeclNS(int isGeneralTextEntity, |
const ENCODING *enc, |
const char *ptr, |
const char *end, |
const char **badPtr, |
const char **versionPtr, |
const char **versionEndPtr, |
const char **encodingNamePtr, |
const ENCODING **namedEncodingPtr, |
int *standalonePtr); |
int XmlInitEncodingNS(INIT_ENCODING *, const ENCODING **, const char *name); |
const ENCODING *XmlGetUtf8InternalEncodingNS(void); |
const ENCODING *XmlGetUtf16InternalEncodingNS(void); |
ENCODING * |
XmlInitUnknownEncodingNS(void *mem, |
int *table, |
CONVERTER convert, |
void *userData); |
#ifdef __cplusplus |
} |
#endif |
#endif /* not XmlTok_INCLUDED */ |
/contrib/sdk/sources/expat/lib/xmltok_impl.c |
---|
0,0 → 1,1783 |
/* Copyright (c) 1998, 1999 Thai Open Source Software Center Ltd |
See the file COPYING for copying permission. |
*/ |
/* This file is included! */ |
#ifdef XML_TOK_IMPL_C |
#ifndef IS_INVALID_CHAR |
#define IS_INVALID_CHAR(enc, ptr, n) (0) |
#endif |
#define INVALID_LEAD_CASE(n, ptr, nextTokPtr) \ |
case BT_LEAD ## n: \ |
if (end - ptr < n) \ |
return XML_TOK_PARTIAL_CHAR; \ |
if (IS_INVALID_CHAR(enc, ptr, n)) { \ |
*(nextTokPtr) = (ptr); \ |
return XML_TOK_INVALID; \ |
} \ |
ptr += n; \ |
break; |
#define INVALID_CASES(ptr, nextTokPtr) \ |
INVALID_LEAD_CASE(2, ptr, nextTokPtr) \ |
INVALID_LEAD_CASE(3, ptr, nextTokPtr) \ |
INVALID_LEAD_CASE(4, ptr, nextTokPtr) \ |
case BT_NONXML: \ |
case BT_MALFORM: \ |
case BT_TRAIL: \ |
*(nextTokPtr) = (ptr); \ |
return XML_TOK_INVALID; |
#define CHECK_NAME_CASE(n, enc, ptr, end, nextTokPtr) \ |
case BT_LEAD ## n: \ |
if (end - ptr < n) \ |
return XML_TOK_PARTIAL_CHAR; \ |
if (!IS_NAME_CHAR(enc, ptr, n)) { \ |
*nextTokPtr = ptr; \ |
return XML_TOK_INVALID; \ |
} \ |
ptr += n; \ |
break; |
#define CHECK_NAME_CASES(enc, ptr, end, nextTokPtr) \ |
case BT_NONASCII: \ |
if (!IS_NAME_CHAR_MINBPC(enc, ptr)) { \ |
*nextTokPtr = ptr; \ |
return XML_TOK_INVALID; \ |
} \ |
case BT_NMSTRT: \ |
case BT_HEX: \ |
case BT_DIGIT: \ |
case BT_NAME: \ |
case BT_MINUS: \ |
ptr += MINBPC(enc); \ |
break; \ |
CHECK_NAME_CASE(2, enc, ptr, end, nextTokPtr) \ |
CHECK_NAME_CASE(3, enc, ptr, end, nextTokPtr) \ |
CHECK_NAME_CASE(4, enc, ptr, end, nextTokPtr) |
#define CHECK_NMSTRT_CASE(n, enc, ptr, end, nextTokPtr) \ |
case BT_LEAD ## n: \ |
if (end - ptr < n) \ |
return XML_TOK_PARTIAL_CHAR; \ |
if (!IS_NMSTRT_CHAR(enc, ptr, n)) { \ |
*nextTokPtr = ptr; \ |
return XML_TOK_INVALID; \ |
} \ |
ptr += n; \ |
break; |
#define CHECK_NMSTRT_CASES(enc, ptr, end, nextTokPtr) \ |
case BT_NONASCII: \ |
if (!IS_NMSTRT_CHAR_MINBPC(enc, ptr)) { \ |
*nextTokPtr = ptr; \ |
return XML_TOK_INVALID; \ |
} \ |
case BT_NMSTRT: \ |
case BT_HEX: \ |
ptr += MINBPC(enc); \ |
break; \ |
CHECK_NMSTRT_CASE(2, enc, ptr, end, nextTokPtr) \ |
CHECK_NMSTRT_CASE(3, enc, ptr, end, nextTokPtr) \ |
CHECK_NMSTRT_CASE(4, enc, ptr, end, nextTokPtr) |
#ifndef PREFIX |
#define PREFIX(ident) ident |
#endif |
/* ptr points to character following "<!-" */ |
static int PTRCALL |
PREFIX(scanComment)(const ENCODING *enc, const char *ptr, |
const char *end, const char **nextTokPtr) |
{ |
if (ptr != end) { |
if (!CHAR_MATCHES(enc, ptr, ASCII_MINUS)) { |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
ptr += MINBPC(enc); |
while (ptr != end) { |
switch (BYTE_TYPE(enc, ptr)) { |
INVALID_CASES(ptr, nextTokPtr) |
case BT_MINUS: |
if ((ptr += MINBPC(enc)) == end) |
return XML_TOK_PARTIAL; |
if (CHAR_MATCHES(enc, ptr, ASCII_MINUS)) { |
if ((ptr += MINBPC(enc)) == end) |
return XML_TOK_PARTIAL; |
if (!CHAR_MATCHES(enc, ptr, ASCII_GT)) { |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
*nextTokPtr = ptr + MINBPC(enc); |
return XML_TOK_COMMENT; |
} |
break; |
default: |
ptr += MINBPC(enc); |
break; |
} |
} |
} |
return XML_TOK_PARTIAL; |
} |
/* ptr points to character following "<!" */ |
static int PTRCALL |
PREFIX(scanDecl)(const ENCODING *enc, const char *ptr, |
const char *end, const char **nextTokPtr) |
{ |
if (ptr == end) |
return XML_TOK_PARTIAL; |
switch (BYTE_TYPE(enc, ptr)) { |
case BT_MINUS: |
return PREFIX(scanComment)(enc, ptr + MINBPC(enc), end, nextTokPtr); |
case BT_LSQB: |
*nextTokPtr = ptr + MINBPC(enc); |
return XML_TOK_COND_SECT_OPEN; |
case BT_NMSTRT: |
case BT_HEX: |
ptr += MINBPC(enc); |
break; |
default: |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
while (ptr != end) { |
switch (BYTE_TYPE(enc, ptr)) { |
case BT_PERCNT: |
if (ptr + MINBPC(enc) == end) |
return XML_TOK_PARTIAL; |
/* don't allow <!ENTITY% foo "whatever"> */ |
switch (BYTE_TYPE(enc, ptr + MINBPC(enc))) { |
case BT_S: case BT_CR: case BT_LF: case BT_PERCNT: |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
/* fall through */ |
case BT_S: case BT_CR: case BT_LF: |
*nextTokPtr = ptr; |
return XML_TOK_DECL_OPEN; |
case BT_NMSTRT: |
case BT_HEX: |
ptr += MINBPC(enc); |
break; |
default: |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
} |
return XML_TOK_PARTIAL; |
} |
static int PTRCALL |
PREFIX(checkPiTarget)(const ENCODING *enc, const char *ptr, |
const char *end, int *tokPtr) |
{ |
int upper = 0; |
*tokPtr = XML_TOK_PI; |
if (end - ptr != MINBPC(enc)*3) |
return 1; |
switch (BYTE_TO_ASCII(enc, ptr)) { |
case ASCII_x: |
break; |
case ASCII_X: |
upper = 1; |
break; |
default: |
return 1; |
} |
ptr += MINBPC(enc); |
switch (BYTE_TO_ASCII(enc, ptr)) { |
case ASCII_m: |
break; |
case ASCII_M: |
upper = 1; |
break; |
default: |
return 1; |
} |
ptr += MINBPC(enc); |
switch (BYTE_TO_ASCII(enc, ptr)) { |
case ASCII_l: |
break; |
case ASCII_L: |
upper = 1; |
break; |
default: |
return 1; |
} |
if (upper) |
return 0; |
*tokPtr = XML_TOK_XML_DECL; |
return 1; |
} |
/* ptr points to character following "<?" */ |
static int PTRCALL |
PREFIX(scanPi)(const ENCODING *enc, const char *ptr, |
const char *end, const char **nextTokPtr) |
{ |
int tok; |
const char *target = ptr; |
if (ptr == end) |
return XML_TOK_PARTIAL; |
switch (BYTE_TYPE(enc, ptr)) { |
CHECK_NMSTRT_CASES(enc, ptr, end, nextTokPtr) |
default: |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
while (ptr != end) { |
switch (BYTE_TYPE(enc, ptr)) { |
CHECK_NAME_CASES(enc, ptr, end, nextTokPtr) |
case BT_S: case BT_CR: case BT_LF: |
if (!PREFIX(checkPiTarget)(enc, target, ptr, &tok)) { |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
ptr += MINBPC(enc); |
while (ptr != end) { |
switch (BYTE_TYPE(enc, ptr)) { |
INVALID_CASES(ptr, nextTokPtr) |
case BT_QUEST: |
ptr += MINBPC(enc); |
if (ptr == end) |
return XML_TOK_PARTIAL; |
if (CHAR_MATCHES(enc, ptr, ASCII_GT)) { |
*nextTokPtr = ptr + MINBPC(enc); |
return tok; |
} |
break; |
default: |
ptr += MINBPC(enc); |
break; |
} |
} |
return XML_TOK_PARTIAL; |
case BT_QUEST: |
if (!PREFIX(checkPiTarget)(enc, target, ptr, &tok)) { |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
ptr += MINBPC(enc); |
if (ptr == end) |
return XML_TOK_PARTIAL; |
if (CHAR_MATCHES(enc, ptr, ASCII_GT)) { |
*nextTokPtr = ptr + MINBPC(enc); |
return tok; |
} |
/* fall through */ |
default: |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
} |
return XML_TOK_PARTIAL; |
} |
static int PTRCALL |
PREFIX(scanCdataSection)(const ENCODING *enc, const char *ptr, |
const char *end, const char **nextTokPtr) |
{ |
static const char CDATA_LSQB[] = { ASCII_C, ASCII_D, ASCII_A, |
ASCII_T, ASCII_A, ASCII_LSQB }; |
int i; |
/* CDATA[ */ |
if (end - ptr < 6 * MINBPC(enc)) |
return XML_TOK_PARTIAL; |
for (i = 0; i < 6; i++, ptr += MINBPC(enc)) { |
if (!CHAR_MATCHES(enc, ptr, CDATA_LSQB[i])) { |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
} |
*nextTokPtr = ptr; |
return XML_TOK_CDATA_SECT_OPEN; |
} |
static int PTRCALL |
PREFIX(cdataSectionTok)(const ENCODING *enc, const char *ptr, |
const char *end, const char **nextTokPtr) |
{ |
if (ptr == end) |
return XML_TOK_NONE; |
if (MINBPC(enc) > 1) { |
size_t n = end - ptr; |
if (n & (MINBPC(enc) - 1)) { |
n &= ~(MINBPC(enc) - 1); |
if (n == 0) |
return XML_TOK_PARTIAL; |
end = ptr + n; |
} |
} |
switch (BYTE_TYPE(enc, ptr)) { |
case BT_RSQB: |
ptr += MINBPC(enc); |
if (ptr == end) |
return XML_TOK_PARTIAL; |
if (!CHAR_MATCHES(enc, ptr, ASCII_RSQB)) |
break; |
ptr += MINBPC(enc); |
if (ptr == end) |
return XML_TOK_PARTIAL; |
if (!CHAR_MATCHES(enc, ptr, ASCII_GT)) { |
ptr -= MINBPC(enc); |
break; |
} |
*nextTokPtr = ptr + MINBPC(enc); |
return XML_TOK_CDATA_SECT_CLOSE; |
case BT_CR: |
ptr += MINBPC(enc); |
if (ptr == end) |
return XML_TOK_PARTIAL; |
if (BYTE_TYPE(enc, ptr) == BT_LF) |
ptr += MINBPC(enc); |
*nextTokPtr = ptr; |
return XML_TOK_DATA_NEWLINE; |
case BT_LF: |
*nextTokPtr = ptr + MINBPC(enc); |
return XML_TOK_DATA_NEWLINE; |
INVALID_CASES(ptr, nextTokPtr) |
default: |
ptr += MINBPC(enc); |
break; |
} |
while (ptr != end) { |
switch (BYTE_TYPE(enc, ptr)) { |
#define LEAD_CASE(n) \ |
case BT_LEAD ## n: \ |
if (end - ptr < n || IS_INVALID_CHAR(enc, ptr, n)) { \ |
*nextTokPtr = ptr; \ |
return XML_TOK_DATA_CHARS; \ |
} \ |
ptr += n; \ |
break; |
LEAD_CASE(2) LEAD_CASE(3) LEAD_CASE(4) |
#undef LEAD_CASE |
case BT_NONXML: |
case BT_MALFORM: |
case BT_TRAIL: |
case BT_CR: |
case BT_LF: |
case BT_RSQB: |
*nextTokPtr = ptr; |
return XML_TOK_DATA_CHARS; |
default: |
ptr += MINBPC(enc); |
break; |
} |
} |
*nextTokPtr = ptr; |
return XML_TOK_DATA_CHARS; |
} |
/* ptr points to character following "</" */ |
static int PTRCALL |
PREFIX(scanEndTag)(const ENCODING *enc, const char *ptr, |
const char *end, const char **nextTokPtr) |
{ |
if (ptr == end) |
return XML_TOK_PARTIAL; |
switch (BYTE_TYPE(enc, ptr)) { |
CHECK_NMSTRT_CASES(enc, ptr, end, nextTokPtr) |
default: |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
while (ptr != end) { |
switch (BYTE_TYPE(enc, ptr)) { |
CHECK_NAME_CASES(enc, ptr, end, nextTokPtr) |
case BT_S: case BT_CR: case BT_LF: |
for (ptr += MINBPC(enc); ptr != end; ptr += MINBPC(enc)) { |
switch (BYTE_TYPE(enc, ptr)) { |
case BT_S: case BT_CR: case BT_LF: |
break; |
case BT_GT: |
*nextTokPtr = ptr + MINBPC(enc); |
return XML_TOK_END_TAG; |
default: |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
} |
return XML_TOK_PARTIAL; |
#ifdef XML_NS |
case BT_COLON: |
/* no need to check qname syntax here, |
since end-tag must match exactly */ |
ptr += MINBPC(enc); |
break; |
#endif |
case BT_GT: |
*nextTokPtr = ptr + MINBPC(enc); |
return XML_TOK_END_TAG; |
default: |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
} |
return XML_TOK_PARTIAL; |
} |
/* ptr points to character following "&#X" */ |
static int PTRCALL |
PREFIX(scanHexCharRef)(const ENCODING *enc, const char *ptr, |
const char *end, const char **nextTokPtr) |
{ |
if (ptr != end) { |
switch (BYTE_TYPE(enc, ptr)) { |
case BT_DIGIT: |
case BT_HEX: |
break; |
default: |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
for (ptr += MINBPC(enc); ptr != end; ptr += MINBPC(enc)) { |
switch (BYTE_TYPE(enc, ptr)) { |
case BT_DIGIT: |
case BT_HEX: |
break; |
case BT_SEMI: |
*nextTokPtr = ptr + MINBPC(enc); |
return XML_TOK_CHAR_REF; |
default: |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
} |
} |
return XML_TOK_PARTIAL; |
} |
/* ptr points to character following "&#" */ |
static int PTRCALL |
PREFIX(scanCharRef)(const ENCODING *enc, const char *ptr, |
const char *end, const char **nextTokPtr) |
{ |
if (ptr != end) { |
if (CHAR_MATCHES(enc, ptr, ASCII_x)) |
return PREFIX(scanHexCharRef)(enc, ptr + MINBPC(enc), end, nextTokPtr); |
switch (BYTE_TYPE(enc, ptr)) { |
case BT_DIGIT: |
break; |
default: |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
for (ptr += MINBPC(enc); ptr != end; ptr += MINBPC(enc)) { |
switch (BYTE_TYPE(enc, ptr)) { |
case BT_DIGIT: |
break; |
case BT_SEMI: |
*nextTokPtr = ptr + MINBPC(enc); |
return XML_TOK_CHAR_REF; |
default: |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
} |
} |
return XML_TOK_PARTIAL; |
} |
/* ptr points to character following "&" */ |
static int PTRCALL |
PREFIX(scanRef)(const ENCODING *enc, const char *ptr, const char *end, |
const char **nextTokPtr) |
{ |
if (ptr == end) |
return XML_TOK_PARTIAL; |
switch (BYTE_TYPE(enc, ptr)) { |
CHECK_NMSTRT_CASES(enc, ptr, end, nextTokPtr) |
case BT_NUM: |
return PREFIX(scanCharRef)(enc, ptr + MINBPC(enc), end, nextTokPtr); |
default: |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
while (ptr != end) { |
switch (BYTE_TYPE(enc, ptr)) { |
CHECK_NAME_CASES(enc, ptr, end, nextTokPtr) |
case BT_SEMI: |
*nextTokPtr = ptr + MINBPC(enc); |
return XML_TOK_ENTITY_REF; |
default: |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
} |
return XML_TOK_PARTIAL; |
} |
/* ptr points to character following first character of attribute name */ |
static int PTRCALL |
PREFIX(scanAtts)(const ENCODING *enc, const char *ptr, const char *end, |
const char **nextTokPtr) |
{ |
#ifdef XML_NS |
int hadColon = 0; |
#endif |
while (ptr != end) { |
switch (BYTE_TYPE(enc, ptr)) { |
CHECK_NAME_CASES(enc, ptr, end, nextTokPtr) |
#ifdef XML_NS |
case BT_COLON: |
if (hadColon) { |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
hadColon = 1; |
ptr += MINBPC(enc); |
if (ptr == end) |
return XML_TOK_PARTIAL; |
switch (BYTE_TYPE(enc, ptr)) { |
CHECK_NMSTRT_CASES(enc, ptr, end, nextTokPtr) |
default: |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
break; |
#endif |
case BT_S: case BT_CR: case BT_LF: |
for (;;) { |
int t; |
ptr += MINBPC(enc); |
if (ptr == end) |
return XML_TOK_PARTIAL; |
t = BYTE_TYPE(enc, ptr); |
if (t == BT_EQUALS) |
break; |
switch (t) { |
case BT_S: |
case BT_LF: |
case BT_CR: |
break; |
default: |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
} |
/* fall through */ |
case BT_EQUALS: |
{ |
int open; |
#ifdef XML_NS |
hadColon = 0; |
#endif |
for (;;) { |
ptr += MINBPC(enc); |
if (ptr == end) |
return XML_TOK_PARTIAL; |
open = BYTE_TYPE(enc, ptr); |
if (open == BT_QUOT || open == BT_APOS) |
break; |
switch (open) { |
case BT_S: |
case BT_LF: |
case BT_CR: |
break; |
default: |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
} |
ptr += MINBPC(enc); |
/* in attribute value */ |
for (;;) { |
int t; |
if (ptr == end) |
return XML_TOK_PARTIAL; |
t = BYTE_TYPE(enc, ptr); |
if (t == open) |
break; |
switch (t) { |
INVALID_CASES(ptr, nextTokPtr) |
case BT_AMP: |
{ |
int tok = PREFIX(scanRef)(enc, ptr + MINBPC(enc), end, &ptr); |
if (tok <= 0) { |
if (tok == XML_TOK_INVALID) |
*nextTokPtr = ptr; |
return tok; |
} |
break; |
} |
case BT_LT: |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
default: |
ptr += MINBPC(enc); |
break; |
} |
} |
ptr += MINBPC(enc); |
if (ptr == end) |
return XML_TOK_PARTIAL; |
switch (BYTE_TYPE(enc, ptr)) { |
case BT_S: |
case BT_CR: |
case BT_LF: |
break; |
case BT_SOL: |
goto sol; |
case BT_GT: |
goto gt; |
default: |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
/* ptr points to closing quote */ |
for (;;) { |
ptr += MINBPC(enc); |
if (ptr == end) |
return XML_TOK_PARTIAL; |
switch (BYTE_TYPE(enc, ptr)) { |
CHECK_NMSTRT_CASES(enc, ptr, end, nextTokPtr) |
case BT_S: case BT_CR: case BT_LF: |
continue; |
case BT_GT: |
gt: |
*nextTokPtr = ptr + MINBPC(enc); |
return XML_TOK_START_TAG_WITH_ATTS; |
case BT_SOL: |
sol: |
ptr += MINBPC(enc); |
if (ptr == end) |
return XML_TOK_PARTIAL; |
if (!CHAR_MATCHES(enc, ptr, ASCII_GT)) { |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
*nextTokPtr = ptr + MINBPC(enc); |
return XML_TOK_EMPTY_ELEMENT_WITH_ATTS; |
default: |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
break; |
} |
break; |
} |
default: |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
} |
return XML_TOK_PARTIAL; |
} |
/* ptr points to character following "<" */ |
static int PTRCALL |
PREFIX(scanLt)(const ENCODING *enc, const char *ptr, const char *end, |
const char **nextTokPtr) |
{ |
#ifdef XML_NS |
int hadColon; |
#endif |
if (ptr == end) |
return XML_TOK_PARTIAL; |
switch (BYTE_TYPE(enc, ptr)) { |
CHECK_NMSTRT_CASES(enc, ptr, end, nextTokPtr) |
case BT_EXCL: |
if ((ptr += MINBPC(enc)) == end) |
return XML_TOK_PARTIAL; |
switch (BYTE_TYPE(enc, ptr)) { |
case BT_MINUS: |
return PREFIX(scanComment)(enc, ptr + MINBPC(enc), end, nextTokPtr); |
case BT_LSQB: |
return PREFIX(scanCdataSection)(enc, ptr + MINBPC(enc), |
end, nextTokPtr); |
} |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
case BT_QUEST: |
return PREFIX(scanPi)(enc, ptr + MINBPC(enc), end, nextTokPtr); |
case BT_SOL: |
return PREFIX(scanEndTag)(enc, ptr + MINBPC(enc), end, nextTokPtr); |
default: |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
#ifdef XML_NS |
hadColon = 0; |
#endif |
/* we have a start-tag */ |
while (ptr != end) { |
switch (BYTE_TYPE(enc, ptr)) { |
CHECK_NAME_CASES(enc, ptr, end, nextTokPtr) |
#ifdef XML_NS |
case BT_COLON: |
if (hadColon) { |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
hadColon = 1; |
ptr += MINBPC(enc); |
if (ptr == end) |
return XML_TOK_PARTIAL; |
switch (BYTE_TYPE(enc, ptr)) { |
CHECK_NMSTRT_CASES(enc, ptr, end, nextTokPtr) |
default: |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
break; |
#endif |
case BT_S: case BT_CR: case BT_LF: |
{ |
ptr += MINBPC(enc); |
while (ptr != end) { |
switch (BYTE_TYPE(enc, ptr)) { |
CHECK_NMSTRT_CASES(enc, ptr, end, nextTokPtr) |
case BT_GT: |
goto gt; |
case BT_SOL: |
goto sol; |
case BT_S: case BT_CR: case BT_LF: |
ptr += MINBPC(enc); |
continue; |
default: |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
return PREFIX(scanAtts)(enc, ptr, end, nextTokPtr); |
} |
return XML_TOK_PARTIAL; |
} |
case BT_GT: |
gt: |
*nextTokPtr = ptr + MINBPC(enc); |
return XML_TOK_START_TAG_NO_ATTS; |
case BT_SOL: |
sol: |
ptr += MINBPC(enc); |
if (ptr == end) |
return XML_TOK_PARTIAL; |
if (!CHAR_MATCHES(enc, ptr, ASCII_GT)) { |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
*nextTokPtr = ptr + MINBPC(enc); |
return XML_TOK_EMPTY_ELEMENT_NO_ATTS; |
default: |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
} |
return XML_TOK_PARTIAL; |
} |
static int PTRCALL |
PREFIX(contentTok)(const ENCODING *enc, const char *ptr, const char *end, |
const char **nextTokPtr) |
{ |
if (ptr == end) |
return XML_TOK_NONE; |
if (MINBPC(enc) > 1) { |
size_t n = end - ptr; |
if (n & (MINBPC(enc) - 1)) { |
n &= ~(MINBPC(enc) - 1); |
if (n == 0) |
return XML_TOK_PARTIAL; |
end = ptr + n; |
} |
} |
switch (BYTE_TYPE(enc, ptr)) { |
case BT_LT: |
return PREFIX(scanLt)(enc, ptr + MINBPC(enc), end, nextTokPtr); |
case BT_AMP: |
return PREFIX(scanRef)(enc, ptr + MINBPC(enc), end, nextTokPtr); |
case BT_CR: |
ptr += MINBPC(enc); |
if (ptr == end) |
return XML_TOK_TRAILING_CR; |
if (BYTE_TYPE(enc, ptr) == BT_LF) |
ptr += MINBPC(enc); |
*nextTokPtr = ptr; |
return XML_TOK_DATA_NEWLINE; |
case BT_LF: |
*nextTokPtr = ptr + MINBPC(enc); |
return XML_TOK_DATA_NEWLINE; |
case BT_RSQB: |
ptr += MINBPC(enc); |
if (ptr == end) |
return XML_TOK_TRAILING_RSQB; |
if (!CHAR_MATCHES(enc, ptr, ASCII_RSQB)) |
break; |
ptr += MINBPC(enc); |
if (ptr == end) |
return XML_TOK_TRAILING_RSQB; |
if (!CHAR_MATCHES(enc, ptr, ASCII_GT)) { |
ptr -= MINBPC(enc); |
break; |
} |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
INVALID_CASES(ptr, nextTokPtr) |
default: |
ptr += MINBPC(enc); |
break; |
} |
while (ptr != end) { |
switch (BYTE_TYPE(enc, ptr)) { |
#define LEAD_CASE(n) \ |
case BT_LEAD ## n: \ |
if (end - ptr < n || IS_INVALID_CHAR(enc, ptr, n)) { \ |
*nextTokPtr = ptr; \ |
return XML_TOK_DATA_CHARS; \ |
} \ |
ptr += n; \ |
break; |
LEAD_CASE(2) LEAD_CASE(3) LEAD_CASE(4) |
#undef LEAD_CASE |
case BT_RSQB: |
if (ptr + MINBPC(enc) != end) { |
if (!CHAR_MATCHES(enc, ptr + MINBPC(enc), ASCII_RSQB)) { |
ptr += MINBPC(enc); |
break; |
} |
if (ptr + 2*MINBPC(enc) != end) { |
if (!CHAR_MATCHES(enc, ptr + 2*MINBPC(enc), ASCII_GT)) { |
ptr += MINBPC(enc); |
break; |
} |
*nextTokPtr = ptr + 2*MINBPC(enc); |
return XML_TOK_INVALID; |
} |
} |
/* fall through */ |
case BT_AMP: |
case BT_LT: |
case BT_NONXML: |
case BT_MALFORM: |
case BT_TRAIL: |
case BT_CR: |
case BT_LF: |
*nextTokPtr = ptr; |
return XML_TOK_DATA_CHARS; |
default: |
ptr += MINBPC(enc); |
break; |
} |
} |
*nextTokPtr = ptr; |
return XML_TOK_DATA_CHARS; |
} |
/* ptr points to character following "%" */ |
static int PTRCALL |
PREFIX(scanPercent)(const ENCODING *enc, const char *ptr, const char *end, |
const char **nextTokPtr) |
{ |
if (ptr == end) |
return XML_TOK_PARTIAL; |
switch (BYTE_TYPE(enc, ptr)) { |
CHECK_NMSTRT_CASES(enc, ptr, end, nextTokPtr) |
case BT_S: case BT_LF: case BT_CR: case BT_PERCNT: |
*nextTokPtr = ptr; |
return XML_TOK_PERCENT; |
default: |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
while (ptr != end) { |
switch (BYTE_TYPE(enc, ptr)) { |
CHECK_NAME_CASES(enc, ptr, end, nextTokPtr) |
case BT_SEMI: |
*nextTokPtr = ptr + MINBPC(enc); |
return XML_TOK_PARAM_ENTITY_REF; |
default: |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
} |
return XML_TOK_PARTIAL; |
} |
static int PTRCALL |
PREFIX(scanPoundName)(const ENCODING *enc, const char *ptr, const char *end, |
const char **nextTokPtr) |
{ |
if (ptr == end) |
return XML_TOK_PARTIAL; |
switch (BYTE_TYPE(enc, ptr)) { |
CHECK_NMSTRT_CASES(enc, ptr, end, nextTokPtr) |
default: |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
while (ptr != end) { |
switch (BYTE_TYPE(enc, ptr)) { |
CHECK_NAME_CASES(enc, ptr, end, nextTokPtr) |
case BT_CR: case BT_LF: case BT_S: |
case BT_RPAR: case BT_GT: case BT_PERCNT: case BT_VERBAR: |
*nextTokPtr = ptr; |
return XML_TOK_POUND_NAME; |
default: |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
} |
return -XML_TOK_POUND_NAME; |
} |
static int PTRCALL |
PREFIX(scanLit)(int open, const ENCODING *enc, |
const char *ptr, const char *end, |
const char **nextTokPtr) |
{ |
while (ptr != end) { |
int t = BYTE_TYPE(enc, ptr); |
switch (t) { |
INVALID_CASES(ptr, nextTokPtr) |
case BT_QUOT: |
case BT_APOS: |
ptr += MINBPC(enc); |
if (t != open) |
break; |
if (ptr == end) |
return -XML_TOK_LITERAL; |
*nextTokPtr = ptr; |
switch (BYTE_TYPE(enc, ptr)) { |
case BT_S: case BT_CR: case BT_LF: |
case BT_GT: case BT_PERCNT: case BT_LSQB: |
return XML_TOK_LITERAL; |
default: |
return XML_TOK_INVALID; |
} |
default: |
ptr += MINBPC(enc); |
break; |
} |
} |
return XML_TOK_PARTIAL; |
} |
static int PTRCALL |
PREFIX(prologTok)(const ENCODING *enc, const char *ptr, const char *end, |
const char **nextTokPtr) |
{ |
int tok; |
if (ptr == end) |
return XML_TOK_NONE; |
if (MINBPC(enc) > 1) { |
size_t n = end - ptr; |
if (n & (MINBPC(enc) - 1)) { |
n &= ~(MINBPC(enc) - 1); |
if (n == 0) |
return XML_TOK_PARTIAL; |
end = ptr + n; |
} |
} |
switch (BYTE_TYPE(enc, ptr)) { |
case BT_QUOT: |
return PREFIX(scanLit)(BT_QUOT, enc, ptr + MINBPC(enc), end, nextTokPtr); |
case BT_APOS: |
return PREFIX(scanLit)(BT_APOS, enc, ptr + MINBPC(enc), end, nextTokPtr); |
case BT_LT: |
{ |
ptr += MINBPC(enc); |
if (ptr == end) |
return XML_TOK_PARTIAL; |
switch (BYTE_TYPE(enc, ptr)) { |
case BT_EXCL: |
return PREFIX(scanDecl)(enc, ptr + MINBPC(enc), end, nextTokPtr); |
case BT_QUEST: |
return PREFIX(scanPi)(enc, ptr + MINBPC(enc), end, nextTokPtr); |
case BT_NMSTRT: |
case BT_HEX: |
case BT_NONASCII: |
case BT_LEAD2: |
case BT_LEAD3: |
case BT_LEAD4: |
*nextTokPtr = ptr - MINBPC(enc); |
return XML_TOK_INSTANCE_START; |
} |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
case BT_CR: |
if (ptr + MINBPC(enc) == end) { |
*nextTokPtr = end; |
/* indicate that this might be part of a CR/LF pair */ |
return -XML_TOK_PROLOG_S; |
} |
/* fall through */ |
case BT_S: case BT_LF: |
for (;;) { |
ptr += MINBPC(enc); |
if (ptr == end) |
break; |
switch (BYTE_TYPE(enc, ptr)) { |
case BT_S: case BT_LF: |
break; |
case BT_CR: |
/* don't split CR/LF pair */ |
if (ptr + MINBPC(enc) != end) |
break; |
/* fall through */ |
default: |
*nextTokPtr = ptr; |
return XML_TOK_PROLOG_S; |
} |
} |
*nextTokPtr = ptr; |
return XML_TOK_PROLOG_S; |
case BT_PERCNT: |
return PREFIX(scanPercent)(enc, ptr + MINBPC(enc), end, nextTokPtr); |
case BT_COMMA: |
*nextTokPtr = ptr + MINBPC(enc); |
return XML_TOK_COMMA; |
case BT_LSQB: |
*nextTokPtr = ptr + MINBPC(enc); |
return XML_TOK_OPEN_BRACKET; |
case BT_RSQB: |
ptr += MINBPC(enc); |
if (ptr == end) |
return -XML_TOK_CLOSE_BRACKET; |
if (CHAR_MATCHES(enc, ptr, ASCII_RSQB)) { |
if (ptr + MINBPC(enc) == end) |
return XML_TOK_PARTIAL; |
if (CHAR_MATCHES(enc, ptr + MINBPC(enc), ASCII_GT)) { |
*nextTokPtr = ptr + 2*MINBPC(enc); |
return XML_TOK_COND_SECT_CLOSE; |
} |
} |
*nextTokPtr = ptr; |
return XML_TOK_CLOSE_BRACKET; |
case BT_LPAR: |
*nextTokPtr = ptr + MINBPC(enc); |
return XML_TOK_OPEN_PAREN; |
case BT_RPAR: |
ptr += MINBPC(enc); |
if (ptr == end) |
return -XML_TOK_CLOSE_PAREN; |
switch (BYTE_TYPE(enc, ptr)) { |
case BT_AST: |
*nextTokPtr = ptr + MINBPC(enc); |
return XML_TOK_CLOSE_PAREN_ASTERISK; |
case BT_QUEST: |
*nextTokPtr = ptr + MINBPC(enc); |
return XML_TOK_CLOSE_PAREN_QUESTION; |
case BT_PLUS: |
*nextTokPtr = ptr + MINBPC(enc); |
return XML_TOK_CLOSE_PAREN_PLUS; |
case BT_CR: case BT_LF: case BT_S: |
case BT_GT: case BT_COMMA: case BT_VERBAR: |
case BT_RPAR: |
*nextTokPtr = ptr; |
return XML_TOK_CLOSE_PAREN; |
} |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
case BT_VERBAR: |
*nextTokPtr = ptr + MINBPC(enc); |
return XML_TOK_OR; |
case BT_GT: |
*nextTokPtr = ptr + MINBPC(enc); |
return XML_TOK_DECL_CLOSE; |
case BT_NUM: |
return PREFIX(scanPoundName)(enc, ptr + MINBPC(enc), end, nextTokPtr); |
#define LEAD_CASE(n) \ |
case BT_LEAD ## n: \ |
if (end - ptr < n) \ |
return XML_TOK_PARTIAL_CHAR; \ |
if (IS_NMSTRT_CHAR(enc, ptr, n)) { \ |
ptr += n; \ |
tok = XML_TOK_NAME; \ |
break; \ |
} \ |
if (IS_NAME_CHAR(enc, ptr, n)) { \ |
ptr += n; \ |
tok = XML_TOK_NMTOKEN; \ |
break; \ |
} \ |
*nextTokPtr = ptr; \ |
return XML_TOK_INVALID; |
LEAD_CASE(2) LEAD_CASE(3) LEAD_CASE(4) |
#undef LEAD_CASE |
case BT_NMSTRT: |
case BT_HEX: |
tok = XML_TOK_NAME; |
ptr += MINBPC(enc); |
break; |
case BT_DIGIT: |
case BT_NAME: |
case BT_MINUS: |
#ifdef XML_NS |
case BT_COLON: |
#endif |
tok = XML_TOK_NMTOKEN; |
ptr += MINBPC(enc); |
break; |
case BT_NONASCII: |
if (IS_NMSTRT_CHAR_MINBPC(enc, ptr)) { |
ptr += MINBPC(enc); |
tok = XML_TOK_NAME; |
break; |
} |
if (IS_NAME_CHAR_MINBPC(enc, ptr)) { |
ptr += MINBPC(enc); |
tok = XML_TOK_NMTOKEN; |
break; |
} |
/* fall through */ |
default: |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
while (ptr != end) { |
switch (BYTE_TYPE(enc, ptr)) { |
CHECK_NAME_CASES(enc, ptr, end, nextTokPtr) |
case BT_GT: case BT_RPAR: case BT_COMMA: |
case BT_VERBAR: case BT_LSQB: case BT_PERCNT: |
case BT_S: case BT_CR: case BT_LF: |
*nextTokPtr = ptr; |
return tok; |
#ifdef XML_NS |
case BT_COLON: |
ptr += MINBPC(enc); |
switch (tok) { |
case XML_TOK_NAME: |
if (ptr == end) |
return XML_TOK_PARTIAL; |
tok = XML_TOK_PREFIXED_NAME; |
switch (BYTE_TYPE(enc, ptr)) { |
CHECK_NAME_CASES(enc, ptr, end, nextTokPtr) |
default: |
tok = XML_TOK_NMTOKEN; |
break; |
} |
break; |
case XML_TOK_PREFIXED_NAME: |
tok = XML_TOK_NMTOKEN; |
break; |
} |
break; |
#endif |
case BT_PLUS: |
if (tok == XML_TOK_NMTOKEN) { |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
*nextTokPtr = ptr + MINBPC(enc); |
return XML_TOK_NAME_PLUS; |
case BT_AST: |
if (tok == XML_TOK_NMTOKEN) { |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
*nextTokPtr = ptr + MINBPC(enc); |
return XML_TOK_NAME_ASTERISK; |
case BT_QUEST: |
if (tok == XML_TOK_NMTOKEN) { |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
*nextTokPtr = ptr + MINBPC(enc); |
return XML_TOK_NAME_QUESTION; |
default: |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
} |
} |
return -tok; |
} |
static int PTRCALL |
PREFIX(attributeValueTok)(const ENCODING *enc, const char *ptr, |
const char *end, const char **nextTokPtr) |
{ |
const char *start; |
if (ptr == end) |
return XML_TOK_NONE; |
start = ptr; |
while (ptr != end) { |
switch (BYTE_TYPE(enc, ptr)) { |
#define LEAD_CASE(n) \ |
case BT_LEAD ## n: ptr += n; break; |
LEAD_CASE(2) LEAD_CASE(3) LEAD_CASE(4) |
#undef LEAD_CASE |
case BT_AMP: |
if (ptr == start) |
return PREFIX(scanRef)(enc, ptr + MINBPC(enc), end, nextTokPtr); |
*nextTokPtr = ptr; |
return XML_TOK_DATA_CHARS; |
case BT_LT: |
/* this is for inside entity references */ |
*nextTokPtr = ptr; |
return XML_TOK_INVALID; |
case BT_LF: |
if (ptr == start) { |
*nextTokPtr = ptr + MINBPC(enc); |
return XML_TOK_DATA_NEWLINE; |
} |
*nextTokPtr = ptr; |
return XML_TOK_DATA_CHARS; |
case BT_CR: |
if (ptr == start) { |
ptr += MINBPC(enc); |
if (ptr == end) |
return XML_TOK_TRAILING_CR; |
if (BYTE_TYPE(enc, ptr) == BT_LF) |
ptr += MINBPC(enc); |
*nextTokPtr = ptr; |
return XML_TOK_DATA_NEWLINE; |
} |
*nextTokPtr = ptr; |
return XML_TOK_DATA_CHARS; |
case BT_S: |
if (ptr == start) { |
*nextTokPtr = ptr + MINBPC(enc); |
return XML_TOK_ATTRIBUTE_VALUE_S; |
} |
*nextTokPtr = ptr; |
return XML_TOK_DATA_CHARS; |
default: |
ptr += MINBPC(enc); |
break; |
} |
} |
*nextTokPtr = ptr; |
return XML_TOK_DATA_CHARS; |
} |
static int PTRCALL |
PREFIX(entityValueTok)(const ENCODING *enc, const char *ptr, |
const char *end, const char **nextTokPtr) |
{ |
const char *start; |
if (ptr == end) |
return XML_TOK_NONE; |
start = ptr; |
while (ptr != end) { |
switch (BYTE_TYPE(enc, ptr)) { |
#define LEAD_CASE(n) \ |
case BT_LEAD ## n: ptr += n; break; |
LEAD_CASE(2) LEAD_CASE(3) LEAD_CASE(4) |
#undef LEAD_CASE |
case BT_AMP: |
if (ptr == start) |
return PREFIX(scanRef)(enc, ptr + MINBPC(enc), end, nextTokPtr); |
*nextTokPtr = ptr; |
return XML_TOK_DATA_CHARS; |
case BT_PERCNT: |
if (ptr == start) { |
int tok = PREFIX(scanPercent)(enc, ptr + MINBPC(enc), |
end, nextTokPtr); |
return (tok == XML_TOK_PERCENT) ? XML_TOK_INVALID : tok; |
} |
*nextTokPtr = ptr; |
return XML_TOK_DATA_CHARS; |
case BT_LF: |
if (ptr == start) { |
*nextTokPtr = ptr + MINBPC(enc); |
return XML_TOK_DATA_NEWLINE; |
} |
*nextTokPtr = ptr; |
return XML_TOK_DATA_CHARS; |
case BT_CR: |
if (ptr == start) { |
ptr += MINBPC(enc); |
if (ptr == end) |
return XML_TOK_TRAILING_CR; |
if (BYTE_TYPE(enc, ptr) == BT_LF) |
ptr += MINBPC(enc); |
*nextTokPtr = ptr; |
return XML_TOK_DATA_NEWLINE; |
} |
*nextTokPtr = ptr; |
return XML_TOK_DATA_CHARS; |
default: |
ptr += MINBPC(enc); |
break; |
} |
} |
*nextTokPtr = ptr; |
return XML_TOK_DATA_CHARS; |
} |
#ifdef XML_DTD |
static int PTRCALL |
PREFIX(ignoreSectionTok)(const ENCODING *enc, const char *ptr, |
const char *end, const char **nextTokPtr) |
{ |
int level = 0; |
if (MINBPC(enc) > 1) { |
size_t n = end - ptr; |
if (n & (MINBPC(enc) - 1)) { |
n &= ~(MINBPC(enc) - 1); |
end = ptr + n; |
} |
} |
while (ptr != end) { |
switch (BYTE_TYPE(enc, ptr)) { |
INVALID_CASES(ptr, nextTokPtr) |
case BT_LT: |
if ((ptr += MINBPC(enc)) == end) |
return XML_TOK_PARTIAL; |
if (CHAR_MATCHES(enc, ptr, ASCII_EXCL)) { |
if ((ptr += MINBPC(enc)) == end) |
return XML_TOK_PARTIAL; |
if (CHAR_MATCHES(enc, ptr, ASCII_LSQB)) { |
++level; |
ptr += MINBPC(enc); |
} |
} |
break; |
case BT_RSQB: |
if ((ptr += MINBPC(enc)) == end) |
return XML_TOK_PARTIAL; |
if (CHAR_MATCHES(enc, ptr, ASCII_RSQB)) { |
if ((ptr += MINBPC(enc)) == end) |
return XML_TOK_PARTIAL; |
if (CHAR_MATCHES(enc, ptr, ASCII_GT)) { |
ptr += MINBPC(enc); |
if (level == 0) { |
*nextTokPtr = ptr; |
return XML_TOK_IGNORE_SECT; |
} |
--level; |
} |
} |
break; |
default: |
ptr += MINBPC(enc); |
break; |
} |
} |
return XML_TOK_PARTIAL; |
} |
#endif /* XML_DTD */ |
static int PTRCALL |
PREFIX(isPublicId)(const ENCODING *enc, const char *ptr, const char *end, |
const char **badPtr) |
{ |
ptr += MINBPC(enc); |
end -= MINBPC(enc); |
for (; ptr != end; ptr += MINBPC(enc)) { |
switch (BYTE_TYPE(enc, ptr)) { |
case BT_DIGIT: |
case BT_HEX: |
case BT_MINUS: |
case BT_APOS: |
case BT_LPAR: |
case BT_RPAR: |
case BT_PLUS: |
case BT_COMMA: |
case BT_SOL: |
case BT_EQUALS: |
case BT_QUEST: |
case BT_CR: |
case BT_LF: |
case BT_SEMI: |
case BT_EXCL: |
case BT_AST: |
case BT_PERCNT: |
case BT_NUM: |
#ifdef XML_NS |
case BT_COLON: |
#endif |
break; |
case BT_S: |
if (CHAR_MATCHES(enc, ptr, ASCII_TAB)) { |
*badPtr = ptr; |
return 0; |
} |
break; |
case BT_NAME: |
case BT_NMSTRT: |
if (!(BYTE_TO_ASCII(enc, ptr) & ~0x7f)) |
break; |
default: |
switch (BYTE_TO_ASCII(enc, ptr)) { |
case 0x24: /* $ */ |
case 0x40: /* @ */ |
break; |
default: |
*badPtr = ptr; |
return 0; |
} |
break; |
} |
} |
return 1; |
} |
/* This must only be called for a well-formed start-tag or empty |
element tag. Returns the number of attributes. Pointers to the |
first attsMax attributes are stored in atts. |
*/ |
static int PTRCALL |
PREFIX(getAtts)(const ENCODING *enc, const char *ptr, |
int attsMax, ATTRIBUTE *atts) |
{ |
enum { other, inName, inValue } state = inName; |
int nAtts = 0; |
int open = 0; /* defined when state == inValue; |
initialization just to shut up compilers */ |
for (ptr += MINBPC(enc);; ptr += MINBPC(enc)) { |
switch (BYTE_TYPE(enc, ptr)) { |
#define START_NAME \ |
if (state == other) { \ |
if (nAtts < attsMax) { \ |
atts[nAtts].name = ptr; \ |
atts[nAtts].normalized = 1; \ |
} \ |
state = inName; \ |
} |
#define LEAD_CASE(n) \ |
case BT_LEAD ## n: START_NAME ptr += (n - MINBPC(enc)); break; |
LEAD_CASE(2) LEAD_CASE(3) LEAD_CASE(4) |
#undef LEAD_CASE |
case BT_NONASCII: |
case BT_NMSTRT: |
case BT_HEX: |
START_NAME |
break; |
#undef START_NAME |
case BT_QUOT: |
if (state != inValue) { |
if (nAtts < attsMax) |
atts[nAtts].valuePtr = ptr + MINBPC(enc); |
state = inValue; |
open = BT_QUOT; |
} |
else if (open == BT_QUOT) { |
state = other; |
if (nAtts < attsMax) |
atts[nAtts].valueEnd = ptr; |
nAtts++; |
} |
break; |
case BT_APOS: |
if (state != inValue) { |
if (nAtts < attsMax) |
atts[nAtts].valuePtr = ptr + MINBPC(enc); |
state = inValue; |
open = BT_APOS; |
} |
else if (open == BT_APOS) { |
state = other; |
if (nAtts < attsMax) |
atts[nAtts].valueEnd = ptr; |
nAtts++; |
} |
break; |
case BT_AMP: |
if (nAtts < attsMax) |
atts[nAtts].normalized = 0; |
break; |
case BT_S: |
if (state == inName) |
state = other; |
else if (state == inValue |
&& nAtts < attsMax |
&& atts[nAtts].normalized |
&& (ptr == atts[nAtts].valuePtr |
|| BYTE_TO_ASCII(enc, ptr) != ASCII_SPACE |
|| BYTE_TO_ASCII(enc, ptr + MINBPC(enc)) == ASCII_SPACE |
|| BYTE_TYPE(enc, ptr + MINBPC(enc)) == open)) |
atts[nAtts].normalized = 0; |
break; |
case BT_CR: case BT_LF: |
/* This case ensures that the first attribute name is counted |
Apart from that we could just change state on the quote. */ |
if (state == inName) |
state = other; |
else if (state == inValue && nAtts < attsMax) |
atts[nAtts].normalized = 0; |
break; |
case BT_GT: |
case BT_SOL: |
if (state != inValue) |
return nAtts; |
break; |
default: |
break; |
} |
} |
/* not reached */ |
} |
static int PTRFASTCALL |
PREFIX(charRefNumber)(const ENCODING *enc, const char *ptr) |
{ |
int result = 0; |
/* skip &# */ |
ptr += 2*MINBPC(enc); |
if (CHAR_MATCHES(enc, ptr, ASCII_x)) { |
for (ptr += MINBPC(enc); |
!CHAR_MATCHES(enc, ptr, ASCII_SEMI); |
ptr += MINBPC(enc)) { |
int c = BYTE_TO_ASCII(enc, ptr); |
switch (c) { |
case ASCII_0: case ASCII_1: case ASCII_2: case ASCII_3: case ASCII_4: |
case ASCII_5: case ASCII_6: case ASCII_7: case ASCII_8: case ASCII_9: |
result <<= 4; |
result |= (c - ASCII_0); |
break; |
case ASCII_A: case ASCII_B: case ASCII_C: |
case ASCII_D: case ASCII_E: case ASCII_F: |
result <<= 4; |
result += 10 + (c - ASCII_A); |
break; |
case ASCII_a: case ASCII_b: case ASCII_c: |
case ASCII_d: case ASCII_e: case ASCII_f: |
result <<= 4; |
result += 10 + (c - ASCII_a); |
break; |
} |
if (result >= 0x110000) |
return -1; |
} |
} |
else { |
for (; !CHAR_MATCHES(enc, ptr, ASCII_SEMI); ptr += MINBPC(enc)) { |
int c = BYTE_TO_ASCII(enc, ptr); |
result *= 10; |
result += (c - ASCII_0); |
if (result >= 0x110000) |
return -1; |
} |
} |
return checkCharRefNumber(result); |
} |
static int PTRCALL |
PREFIX(predefinedEntityName)(const ENCODING *enc, const char *ptr, |
const char *end) |
{ |
switch ((end - ptr)/MINBPC(enc)) { |
case 2: |
if (CHAR_MATCHES(enc, ptr + MINBPC(enc), ASCII_t)) { |
switch (BYTE_TO_ASCII(enc, ptr)) { |
case ASCII_l: |
return ASCII_LT; |
case ASCII_g: |
return ASCII_GT; |
} |
} |
break; |
case 3: |
if (CHAR_MATCHES(enc, ptr, ASCII_a)) { |
ptr += MINBPC(enc); |
if (CHAR_MATCHES(enc, ptr, ASCII_m)) { |
ptr += MINBPC(enc); |
if (CHAR_MATCHES(enc, ptr, ASCII_p)) |
return ASCII_AMP; |
} |
} |
break; |
case 4: |
switch (BYTE_TO_ASCII(enc, ptr)) { |
case ASCII_q: |
ptr += MINBPC(enc); |
if (CHAR_MATCHES(enc, ptr, ASCII_u)) { |
ptr += MINBPC(enc); |
if (CHAR_MATCHES(enc, ptr, ASCII_o)) { |
ptr += MINBPC(enc); |
if (CHAR_MATCHES(enc, ptr, ASCII_t)) |
return ASCII_QUOT; |
} |
} |
break; |
case ASCII_a: |
ptr += MINBPC(enc); |
if (CHAR_MATCHES(enc, ptr, ASCII_p)) { |
ptr += MINBPC(enc); |
if (CHAR_MATCHES(enc, ptr, ASCII_o)) { |
ptr += MINBPC(enc); |
if (CHAR_MATCHES(enc, ptr, ASCII_s)) |
return ASCII_APOS; |
} |
} |
break; |
} |
} |
return 0; |
} |
static int PTRCALL |
PREFIX(sameName)(const ENCODING *enc, const char *ptr1, const char *ptr2) |
{ |
for (;;) { |
switch (BYTE_TYPE(enc, ptr1)) { |
#define LEAD_CASE(n) \ |
case BT_LEAD ## n: \ |
if (*ptr1++ != *ptr2++) \ |
return 0; |
LEAD_CASE(4) LEAD_CASE(3) LEAD_CASE(2) |
#undef LEAD_CASE |
/* fall through */ |
if (*ptr1++ != *ptr2++) |
return 0; |
break; |
case BT_NONASCII: |
case BT_NMSTRT: |
#ifdef XML_NS |
case BT_COLON: |
#endif |
case BT_HEX: |
case BT_DIGIT: |
case BT_NAME: |
case BT_MINUS: |
if (*ptr2++ != *ptr1++) |
return 0; |
if (MINBPC(enc) > 1) { |
if (*ptr2++ != *ptr1++) |
return 0; |
if (MINBPC(enc) > 2) { |
if (*ptr2++ != *ptr1++) |
return 0; |
if (MINBPC(enc) > 3) { |
if (*ptr2++ != *ptr1++) |
return 0; |
} |
} |
} |
break; |
default: |
if (MINBPC(enc) == 1 && *ptr1 == *ptr2) |
return 1; |
switch (BYTE_TYPE(enc, ptr2)) { |
case BT_LEAD2: |
case BT_LEAD3: |
case BT_LEAD4: |
case BT_NONASCII: |
case BT_NMSTRT: |
#ifdef XML_NS |
case BT_COLON: |
#endif |
case BT_HEX: |
case BT_DIGIT: |
case BT_NAME: |
case BT_MINUS: |
return 0; |
default: |
return 1; |
} |
} |
} |
/* not reached */ |
} |
static int PTRCALL |
PREFIX(nameMatchesAscii)(const ENCODING *enc, const char *ptr1, |
const char *end1, const char *ptr2) |
{ |
for (; *ptr2; ptr1 += MINBPC(enc), ptr2++) { |
if (ptr1 == end1) |
return 0; |
if (!CHAR_MATCHES(enc, ptr1, *ptr2)) |
return 0; |
} |
return ptr1 == end1; |
} |
static int PTRFASTCALL |
PREFIX(nameLength)(const ENCODING *enc, const char *ptr) |
{ |
const char *start = ptr; |
for (;;) { |
switch (BYTE_TYPE(enc, ptr)) { |
#define LEAD_CASE(n) \ |
case BT_LEAD ## n: ptr += n; break; |
LEAD_CASE(2) LEAD_CASE(3) LEAD_CASE(4) |
#undef LEAD_CASE |
case BT_NONASCII: |
case BT_NMSTRT: |
#ifdef XML_NS |
case BT_COLON: |
#endif |
case BT_HEX: |
case BT_DIGIT: |
case BT_NAME: |
case BT_MINUS: |
ptr += MINBPC(enc); |
break; |
default: |
return (int)(ptr - start); |
} |
} |
} |
static const char * PTRFASTCALL |
PREFIX(skipS)(const ENCODING *enc, const char *ptr) |
{ |
for (;;) { |
switch (BYTE_TYPE(enc, ptr)) { |
case BT_LF: |
case BT_CR: |
case BT_S: |
ptr += MINBPC(enc); |
break; |
default: |
return ptr; |
} |
} |
} |
static void PTRCALL |
PREFIX(updatePosition)(const ENCODING *enc, |
const char *ptr, |
const char *end, |
POSITION *pos) |
{ |
while (ptr < end) { |
switch (BYTE_TYPE(enc, ptr)) { |
#define LEAD_CASE(n) \ |
case BT_LEAD ## n: \ |
ptr += n; \ |
break; |
LEAD_CASE(2) LEAD_CASE(3) LEAD_CASE(4) |
#undef LEAD_CASE |
case BT_LF: |
pos->columnNumber = (XML_Size)-1; |
pos->lineNumber++; |
ptr += MINBPC(enc); |
break; |
case BT_CR: |
pos->lineNumber++; |
ptr += MINBPC(enc); |
if (ptr != end && BYTE_TYPE(enc, ptr) == BT_LF) |
ptr += MINBPC(enc); |
pos->columnNumber = (XML_Size)-1; |
break; |
default: |
ptr += MINBPC(enc); |
break; |
} |
pos->columnNumber++; |
} |
} |
#undef DO_LEAD_CASE |
#undef MULTIBYTE_CASES |
#undef INVALID_CASES |
#undef CHECK_NAME_CASE |
#undef CHECK_NAME_CASES |
#undef CHECK_NMSTRT_CASE |
#undef CHECK_NMSTRT_CASES |
#endif /* XML_TOK_IMPL_C */ |
/contrib/sdk/sources/expat/lib/xmltok_impl.h |
---|
0,0 → 1,46 |
/* |
Copyright (c) 1998, 1999 Thai Open Source Software Center Ltd |
See the file COPYING for copying permission. |
*/ |
enum { |
BT_NONXML, |
BT_MALFORM, |
BT_LT, |
BT_AMP, |
BT_RSQB, |
BT_LEAD2, |
BT_LEAD3, |
BT_LEAD4, |
BT_TRAIL, |
BT_CR, |
BT_LF, |
BT_GT, |
BT_QUOT, |
BT_APOS, |
BT_EQUALS, |
BT_QUEST, |
BT_EXCL, |
BT_SOL, |
BT_SEMI, |
BT_NUM, |
BT_LSQB, |
BT_S, |
BT_NMSTRT, |
BT_COLON, |
BT_HEX, |
BT_DIGIT, |
BT_NAME, |
BT_MINUS, |
BT_OTHER, /* known not to be a name or name start character */ |
BT_NONASCII, /* might be a name or name start character */ |
BT_PERCNT, |
BT_LPAR, |
BT_RPAR, |
BT_AST, |
BT_PLUS, |
BT_COMMA, |
BT_VERBAR |
}; |
#include <stddef.h> |
/contrib/sdk/sources/expat/lib/xmltok_ns.c |
---|
0,0 → 1,115 |
/* Copyright (c) 1998, 1999 Thai Open Source Software Center Ltd |
See the file COPYING for copying permission. |
*/ |
/* This file is included! */ |
#ifdef XML_TOK_NS_C |
const ENCODING * |
NS(XmlGetUtf8InternalEncoding)(void) |
{ |
return &ns(internal_utf8_encoding).enc; |
} |
const ENCODING * |
NS(XmlGetUtf16InternalEncoding)(void) |
{ |
#if BYTEORDER == 1234 |
return &ns(internal_little2_encoding).enc; |
#elif BYTEORDER == 4321 |
return &ns(internal_big2_encoding).enc; |
#else |
const short n = 1; |
return (*(const char *)&n |
? &ns(internal_little2_encoding).enc |
: &ns(internal_big2_encoding).enc); |
#endif |
} |
static const ENCODING * const NS(encodings)[] = { |
&ns(latin1_encoding).enc, |
&ns(ascii_encoding).enc, |
&ns(utf8_encoding).enc, |
&ns(big2_encoding).enc, |
&ns(big2_encoding).enc, |
&ns(little2_encoding).enc, |
&ns(utf8_encoding).enc /* NO_ENC */ |
}; |
static int PTRCALL |
NS(initScanProlog)(const ENCODING *enc, const char *ptr, const char *end, |
const char **nextTokPtr) |
{ |
return initScan(NS(encodings), (const INIT_ENCODING *)enc, |
XML_PROLOG_STATE, ptr, end, nextTokPtr); |
} |
static int PTRCALL |
NS(initScanContent)(const ENCODING *enc, const char *ptr, const char *end, |
const char **nextTokPtr) |
{ |
return initScan(NS(encodings), (const INIT_ENCODING *)enc, |
XML_CONTENT_STATE, ptr, end, nextTokPtr); |
} |
int |
NS(XmlInitEncoding)(INIT_ENCODING *p, const ENCODING **encPtr, |
const char *name) |
{ |
int i = getEncodingIndex(name); |
if (i == UNKNOWN_ENC) |
return 0; |
SET_INIT_ENC_INDEX(p, i); |
p->initEnc.scanners[XML_PROLOG_STATE] = NS(initScanProlog); |
p->initEnc.scanners[XML_CONTENT_STATE] = NS(initScanContent); |
p->initEnc.updatePosition = initUpdatePosition; |
p->encPtr = encPtr; |
*encPtr = &(p->initEnc); |
return 1; |
} |
static const ENCODING * |
NS(findEncoding)(const ENCODING *enc, const char *ptr, const char *end) |
{ |
#define ENCODING_MAX 128 |
char buf[ENCODING_MAX]; |
char *p = buf; |
int i; |
XmlUtf8Convert(enc, &ptr, end, &p, p + ENCODING_MAX - 1); |
if (ptr != end) |
return 0; |
*p = 0; |
if (streqci(buf, KW_UTF_16) && enc->minBytesPerChar == 2) |
return enc; |
i = getEncodingIndex(buf); |
if (i == UNKNOWN_ENC) |
return 0; |
return NS(encodings)[i]; |
} |
int |
NS(XmlParseXmlDecl)(int isGeneralTextEntity, |
const ENCODING *enc, |
const char *ptr, |
const char *end, |
const char **badPtr, |
const char **versionPtr, |
const char **versionEndPtr, |
const char **encodingName, |
const ENCODING **encoding, |
int *standalone) |
{ |
return doParseXmlDecl(NS(findEncoding), |
isGeneralTextEntity, |
enc, |
ptr, |
end, |
badPtr, |
versionPtr, |
versionEndPtr, |
encodingName, |
encoding, |
standalone); |
} |
#endif /* XML_TOK_NS_C */ |