Merged in development from the main REXML repository.

* Fixed bug #34, typo in xpath_parser.
* Previous fix, (include? -> includes?) was incorrect.
* Added another test for encoding
* Started AnyName support in RelaxNG
* Added Element#Attributes#to_a, so that it does something intelligent.
  This was needed by XPath, for '@*'
* Fixed XPath so that @* works.
* Added xmlgrep to the bin/ directory.  A little tool allowing you to grep
  for XPaths in an XML document.
* Fixed a CDATA pretty-printing bug. (#39)
* Fixed a buffering bug in Source.rb that affected the SAX parser.
  This bug was related to how REXML determines the encoding of a file, and
  showed up as a hang on input when using the SAX parser.
* Added the unit test for the previous patch, which I forgot to commit.
* Minor pretty printing fix.
* Applied Curt Sampson's optimization improvements
* Issue #9; 3.1.3: The SAX parser was not denormalizing entity references
  in incoming text.  All declared internal entities, as well as numeric
  entities, should now be denormalized.  There was a related bug in that the
  SAX parser was actually double-encoding entities; this is also fixed.
* bin/* programs should now be executable.
* Issue #14; 3.1.3: DTD events are now all being passed by StreamParser.
  Some of the DTD events were not being passed through by the stream parser.
* #26: Element#add_element(nil) now raises an error.  Changed XPath searches
  so that if a non-Hash is passed, an error is raised.  Fixed a spurious
  undefined method error in encoding.  #29: XPath ordering bug fixed by Mark
  Williams.  Incidentally, Mark supplied a superlative bug report, including a
  full unit test.  Then he went ahead and fixed the bug.  It doesn't get any
  better than this, folks.
* Fixed a broken link.  Thanks to Dick Davies for pointing it out.  Added
  functions courtesy of Michael Neumann <mneumann@xxxx.de>.
  Example code to follow.
* Added Michael's sample code.  Merged the changes in from branches/xpath_V
* Fixed the preceding:: and following:: axes.  Fixed the ordering bug that
  Martin Fowler reported.
* Uncommented some code commented out for testing.  Applied Nobu's changes to
  the Encoding infrastructure, which should fix potential threading issues.
* Added more tests, and the missing syncenumerator class.  Fixed the
  inheritance bug in the pull parser that James Britt found.  Indentation
  changes, and changed some exceptions to runtime exceptions.
* Changes by Matz, mostly of indent -> indent_level, to avoid
  function/variable naming conflicts
* Tabs -> spaces (whitespace)

Note the addition of syncenumerator.rb.  This is a stopgap, until I can work on
the class enough to get it accepted as a replacement for the SyncEnumerator
that comes with the Generator class.  My version is orders of magnitude faster
than the Generator SyncEnumerator, but is currently missing a couple of
features of the original.  Eventually, I expect this class to migrate to
another part of the source tree.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@8483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
ser 2005-05-19 02:58:11 +00:00
Parent a399253153
Commit 21e8df5c10
15 changed files: 1209 additions and 865 deletions

View file

@ -16,164 +16,166 @@ module REXML
# Document has a single child that can be accessed by root(). # Document has a single child that can be accessed by root().
# Note that if you want to have an XML declaration written for a document # Note that if you want to have an XML declaration written for a document
# you create, you must add one; REXML documents do not write a default # you create, you must add one; REXML documents do not write a default
# declaration for you. See |DECLARATION| and |write|. # declaration for you. See |DECLARATION| and |write|.
class Document < Element class Document < Element
# A convenient default XML declaration. If you want an XML declaration, # A convenient default XML declaration. If you want an XML declaration,
# the easiest way to add one is mydoc << Document::DECLARATION # the easiest way to add one is mydoc << Document::DECLARATION
# +DEPRECATED+ # +DEPRECATED+
# Use: mydoc << XMLDecl.default # Use: mydoc << XMLDecl.default
DECLARATION = XMLDecl.default DECLARATION = XMLDecl.default
# Constructor # Constructor
# @param source if supplied, must be a Document, String, or IO. # @param source if supplied, must be a Document, String, or IO.
# Documents have their context and Element attributes cloned. # Documents have their context and Element attributes cloned.
# Strings are expected to be valid XML documents. IOs are expected # Strings are expected to be valid XML documents. IOs are expected
# to be sources of valid XML documents. # to be sources of valid XML documents.
# @param context if supplied, contains the context of the document; # @param context if supplied, contains the context of the document;
# this should be a Hash. # this should be a Hash.
# NOTE that I'm not sure what the context is for; I cloned it out of # NOTE that I'm not sure what the context is for; I cloned it out of
# the Electric XML API (in which it also seems to do nothing), and it # the Electric XML API (in which it also seems to do nothing), and it
# is now legacy. It may do something, someday... it may disappear. # is now legacy. It may do something, someday... it may disappear.
def initialize( source = nil, context = {} ) def initialize( source = nil, context = {} )
super() super()
@context = context @context = context
return if source.nil? return if source.nil?
if source.kind_of? Document if source.kind_of? Document
@context = source.context @context = source.context
super source super source
else else
build( source ) build( source )
end end
end end
def node_type def node_type
:document :document
end end
# Should be obvious # Should be obvious
def clone def clone
Document.new self Document.new self
end end
# According to the XML spec, a root node has no expanded name # According to the XML spec, a root node has no expanded name
def expanded_name def expanded_name
'' ''
#d = doc_type #d = doc_type
#d ? d.name : "UNDEFINED" #d ? d.name : "UNDEFINED"
end end
alias :name :expanded_name alias :name :expanded_name
# We override this, because XMLDecls and DocTypes must go at the start # We override this, because XMLDecls and DocTypes must go at the start
# of the document # of the document
def add( child ) def add( child )
if child.kind_of? XMLDecl if child.kind_of? XMLDecl
@children.unshift child @children.unshift child
elsif child.kind_of? DocType elsif child.kind_of? DocType
if @children[0].kind_of? XMLDecl if @children[0].kind_of? XMLDecl
@children[1,0] = child @children[1,0] = child
else else
@children.unshift child @children.unshift child
end end
child.parent = self child.parent = self
else else
rv = super rv = super
raise "attempted adding second root element to document" if @elements.size > 1 raise "attempted adding second root element to document" if @elements.size > 1
rv rv
end end
end end
alias :<< :add alias :<< :add
def add_element(arg=nil, arg2=nil) def add_element(arg=nil, arg2=nil)
rv = super rv = super
raise "attempted adding second root element to document" if @elements.size > 1 raise "attempted adding second root element to document" if @elements.size > 1
rv rv
end end
# @return the root Element of the document, or nil if this document # @return the root Element of the document, or nil if this document
# has no children. # has no children.
def root def root
@children.find { |item| item.kind_of? Element } elements[1]
end #self
#@children.find { |item| item.kind_of? Element }
end
# @return the DocType child of the document, if one exists, # @return the DocType child of the document, if one exists,
# and nil otherwise. # and nil otherwise.
def doctype def doctype
@children.find { |item| item.kind_of? DocType } @children.find { |item| item.kind_of? DocType }
end end
# @return the XMLDecl of this document; if no XMLDecl has been # @return the XMLDecl of this document; if no XMLDecl has been
# set, the default declaration is returned. # set, the default declaration is returned.
def xml_decl def xml_decl
rv = @children[0] rv = @children[0]
return rv if rv.kind_of? XMLDecl return rv if rv.kind_of? XMLDecl
rv = @children.unshift(XMLDecl.default)[0] rv = @children.unshift(XMLDecl.default)[0]
end end
# @return the XMLDecl version of this document as a String. # @return the XMLDecl version of this document as a String.
# If no XMLDecl has been set, returns the default version. # If no XMLDecl has been set, returns the default version.
def version def version
xml_decl().version xml_decl().version
end end
# @return the XMLDecl encoding of this document as a String. # @return the XMLDecl encoding of this document as a String.
# If no XMLDecl has been set, returns the default encoding. # If no XMLDecl has been set, returns the default encoding.
def encoding def encoding
xml_decl().encoding xml_decl().encoding
end end
# @return the XMLDecl standalone value of this document as a String. # @return the XMLDecl standalone value of this document as a String.
# If no XMLDecl has been set, returns the default setting. # If no XMLDecl has been set, returns the default setting.
def stand_alone? def stand_alone?
xml_decl().stand_alone? xml_decl().stand_alone?
end end
# Write the XML tree out, optionally with indent. This writes out the # Write the XML tree out, optionally with indent. This writes out the
# entire XML document, including XML declarations, doctype declarations, # entire XML document, including XML declarations, doctype declarations,
# and processing instructions (if any are given). # and processing instructions (if any are given).
# A controversial point is whether Document should always write the XML # A controversial point is whether Document should always write the XML
# declaration (<?xml version='1.0'?>) whether or not one is given by the # declaration (<?xml version='1.0'?>) whether or not one is given by the
# user (or source document). REXML does not write one if one was not # user (or source document). REXML does not write one if one was not
# specified, because it adds unneccessary bandwidth to applications such # specified, because it adds unneccessary bandwidth to applications such
# as XML-RPC. # as XML-RPC.
# #
# #
# output:: # output::
# output an object which supports '<< string'; this is where the # output an object which supports '<< string'; this is where the
# document will be written. # document will be written.
# indent:: # indent::
# An integer. If -1, no indenting will be used; otherwise, the # An integer. If -1, no indenting will be used; otherwise, the
# indentation will be this number of spaces, and children will be # indentation will be this number of spaces, and children will be
# indented an additional amount. Defaults to -1 # indented an additional amount. Defaults to -1
# transitive:: # transitive::
# If transitive is true and indent is >= 0, then the output will be # If transitive is true and indent is >= 0, then the output will be
# pretty-printed in such a way that the added whitespace does not affect # pretty-printed in such a way that the added whitespace does not affect
# the absolute *value* of the document -- that is, it leaves the value # the absolute *value* of the document -- that is, it leaves the value
# and number of Text nodes in the document unchanged. # and number of Text nodes in the document unchanged.
# ie_hack:: # ie_hack::
# Internet Explorer is the worst piece of crap to have ever been # Internet Explorer is the worst piece of crap to have ever been
# written, with the possible exception of Windows itself. Since IE is # written, with the possible exception of Windows itself. Since IE is
# unable to parse proper XML, we have to provide a hack to generate XML # unable to parse proper XML, we have to provide a hack to generate XML
# that IE's limited abilities can handle. This hack inserts a space # that IE's limited abilities can handle. This hack inserts a space
# before the /> on empty tags. Defaults to false # before the /> on empty tags. Defaults to false
def write( output=$stdout, indent_level=-1, transitive=false, ie_hack=false ) def write( output=$stdout, indent_level=-1, transitive=false, ie_hack=false )
output = Output.new( output, xml_decl.encoding ) if xml_decl.encoding != "UTF-8" && !output.kind_of?(Output) output = Output.new( output, xml_decl.encoding ) if xml_decl.encoding != "UTF-8" && !output.kind_of?(Output)
@children.each { |node| @children.each { |node|
indent( output, indent_level ) if node.node_type == :element indent( output, indent_level ) if node.node_type == :element
if node.write( output, indent_level, transitive, ie_hack ) if node.write( output, indent_level, transitive, ie_hack )
output << "\n" unless indent_level<0 or node == @children[-1] output << "\n" unless indent_level<0 or node == @children[-1]
end end
} }
end end
def Document::parse_stream( source, listener ) def Document::parse_stream( source, listener )
Parsers::StreamParser.new( source, listener ).parse Parsers::StreamParser.new( source, listener ).parse
end end
private private
def build( source ) def build( source )
Parsers::TreeParser.new( source, self ).parse Parsers::TreeParser.new( source, self ).parse
end end
end end
end end

View file

@ -6,6 +6,14 @@ require "rexml/xpath"
require "rexml/parseexception" require "rexml/parseexception"
module REXML module REXML
# An implementation note about namespaces:
# As we parse, when we find namespaces we put them in a hash and assign
# them a unique ID. We then convert the namespace prefix for the node
# to the unique ID. This makes namespace lookup much faster for the
# cost of extra memory use. We save the namespace prefix for the
# context node and convert it back when we write it.
@@namespaces = {}
# Represents a tagged XML element. Elements are characterized by # Represents a tagged XML element. Elements are characterized by
# having children, attributes, and names, and can themselves be # having children, attributes, and names, and can themselves be
# children. # children.
@ -91,19 +99,35 @@ module REXML
Element.new self Element.new self
end end
# Evaluates to the root element of the document that this element # Evaluates to the root node of the document that this element
# belongs to. If this element doesn't belong to a document, but does # belongs to. If this element doesn't belong to a document, but does
# belong to another Element, the parent's root will be returned, until the # belong to another Element, the parent's root will be returned, until the
# earliest ancestor is found. # earliest ancestor is found.
#
# Note that this is not the same as the document element.
# In the following example, <a> is the document element, and the root
# node is the parent node of the document element. You may ask yourself
# why the root node is useful: consider the doctype and XML declaration,
# and any processing instructions before the document element... they
# are children of the root node, or siblings of the document element.
# The only time this isn't true is when an Element is created that is
# not part of any Document. In this case, the ancestor that has no
# parent acts as the root node.
# d = Document.new '<a><b><c/></b></a>' # d = Document.new '<a><b><c/></b></a>'
# a = d[1] ; c = a[1][1] # a = d[1] ; c = a[1][1]
# d.root # These all evaluate to the same Element, # d.root_node == d # TRUE
# a.root # namely, <a> # a.root_node # namely, d
# c.root # # c.root_node # again, d
def root def root_node
parent.nil? ? self : parent.root parent.nil? ? self : parent.root_node
end end
def root
return elements[1] if self.kind_of? Document
return self if parent.kind_of? Document or parent.nil?
return parent.root
end
# Evaluates to the document to which this element belongs, or nil if this # Evaluates to the document to which this element belongs, or nil if this
# element doesn't belong to a document. # element doesn't belong to a document.
def document def document
@ -270,7 +294,8 @@ module REXML
# el = doc.add_element 'my-tag', {'attr1'=>'val1', 'attr2'=>'val2'} # el = doc.add_element 'my-tag', {'attr1'=>'val1', 'attr2'=>'val2'}
# el = Element.new 'my-tag' # el = Element.new 'my-tag'
# doc.add_element el # doc.add_element el
def add_element element=nil, attrs=nil def add_element element, attrs=nil
raise "First argument must be either an element name, or an Element object" if element.nil?
el = @elements.add(element) el = @elements.add(element)
if attrs.kind_of? Hash if attrs.kind_of? Hash
attrs.each do |key, value| attrs.each do |key, value|
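
A small sketch of how `root` and `root_node` differ after this change. It assumes a REXML release that contains both methods as shown in the hunk above and that the `rexml` library is installed; the sample document and the variable names are made up for illustration.

```ruby
require "rexml/document"

# Sample document invented for illustration.
doc = REXML::Document.new("<?xml version='1.0'?><a><b><c/></b></a>")

a = doc.root                        # the document element, <a>
c = a.elements["b"].elements["c"]   # a nested element, <c/>

# root walks up until it reaches the document element ...
puts c.root.name
# ... while root_node keeps going and ends at the Document itself.
puts c.root_node.equal?(doc)

# An element that belongs to no Document acts as its own root node.
orphan = REXML::Element.new("lonely")
puts orphan.root_node.equal?(orphan)
```

The XML declaration in the sample is a child of the root node and a sibling of `<a>`, which is the case the new comment in the hunk describes.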

View file

@ -7,41 +7,33 @@ module REXML
# Therefore, in XML, "local-name()" is identical (and actually becomes) # Therefore, in XML, "local-name()" is identical (and actually becomes)
# "local_name()" # "local_name()"
module Functions module Functions
@@node = nil @@context = nil
@@index = nil
@@size = nil
@@variables = {}
@@namespace_context = {} @@namespace_context = {}
@@variables = {}
def Functions::node=(value); @@node = value; end def Functions::namespace_context=(x) ; @@namespace_context=x ; end
def Functions::index=(value); @@index = value; end def Functions::variables=(x) ; @@variables=x ; end
def Functions::size=(value); @@size = value; end def Functions::namespace_context ; @@namespace_context ; end
def Functions::variables=(value); @@variables = value; end def Functions::variables ; @@variables ; end
def Functions::namespace_context=(value)
@@namespace_context = value def Functions::context=(value); @@context = value; end
end
def Functions::node; @@node; end
def Functions::index; @@index; end
def Functions::size; @@size; end
def Functions::variables; @@variables; end
def Functions::namespace_context; @@namespace_context; end
def Functions::text( ) def Functions::text( )
if @@node.node_type == :element if @@context[:node].node_type == :element
return @@node.text return @@context[:node].find_all{|n| n.node_type == :text}.collect{|n| n.value}
elsif @@node.node_type == :text elsif @@context[:node].node_type == :text
return @@node.value return @@context[:node].value
else else
return false return false
end end
end end
def Functions::last( ) def Functions::last( )
@@size @@context[:size]
end end
def Functions::position( ) def Functions::position( )
@@index @@context[:index]
end end
def Functions::count( node_set ) def Functions::count( node_set )
@ -73,7 +65,7 @@ module REXML
# Helper method. # Helper method.
def Functions::get_namespace( node_set = nil ) def Functions::get_namespace( node_set = nil )
if node_set == nil if node_set == nil
yield @@node if defined? @@node.namespace yield @@context[:node] if defined? @@context[:node].namespace
else else
if node_set.namespace if node_set.namespace
yield node_set yield node_set
@ -214,7 +206,7 @@ module REXML
# UNTESTED # UNTESTED
def Functions::normalize_space( string=nil ) def Functions::normalize_space( string=nil )
string = string(@@node) if string.nil? string = string(@@context[:node]) if string.nil?
if string.kind_of? Array if string.kind_of? Array
string.collect{|x| string.to_s.strip.gsub(/\s+/um, ' ') if string} string.collect{|x| string.to_s.strip.gsub(/\s+/um, ' ') if string}
else else
@ -291,7 +283,7 @@ module REXML
# UNTESTED # UNTESTED
def Functions::lang( language ) def Functions::lang( language )
lang = false lang = false
node = @@node node = @@context[:node]
attr = nil attr = nil
until node.nil? until node.nil?
if node.node_type == :element if node.node_type == :element
@ -325,15 +317,16 @@ module REXML
# an object of a type other than the four basic types is converted to a # an object of a type other than the four basic types is converted to a
# number in a way that is dependent on that type # number in a way that is dependent on that type
def Functions::number( object=nil ) def Functions::number( object=nil )
object = @@node unless object object = @@context[:node] unless object
if object == true case object
when true
Float(1) Float(1)
elsif object == false when false
Float(0) Float(0)
elsif object.kind_of? Array when Array
number(string( object )) number(string( object ))
elsif object.kind_of? Float when Numeric
object object.to_f
else else
str = string( object ) str = string( object )
#puts "STRING OF #{object.inspect} = #{str}" #puts "STRING OF #{object.inspect} = #{str}"
@ -364,9 +357,13 @@ module REXML
end end
end end
def Functions::processing_instruction( node )
node.node_type == :processing_instruction
end
def Functions::method_missing( id ) def Functions::method_missing( id )
puts "METHOD MISSING #{id.id2name}" puts "METHOD MISSING #{id.id2name}"
XPath.match( @@node, id.id2name ) XPath.match( @@context[:node], id.id2name )
end end
end end
end end
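
The `number` hunk above swaps a chain of `elsif` tests for `case`/`when` and widens the `Float` branch to `Numeric`, so integers are converted as well. A standalone sketch of that shape follows. `to_xpath_number` and `XPATH_NUMBER` are hypothetical names, and the string branch is a simplified assumption rather than the code REXML uses.

```ruby
# Hypothetical helper mirroring the case/when shape of Functions::number;
# it is not REXML's implementation.
XPATH_NUMBER = /\A-?(?:\d+(?:\.\d*)?|\.\d+)\z/

def to_xpath_number(object)
  case object
  when true    then 1.0
  when false   then 0.0
  when Numeric then object.to_f     # Integer and Float both land here
  else
    text = object.to_s.strip
    # Assumption: anything that is not a plain decimal literal becomes NaN.
    text =~ XPATH_NUMBER ? text.to_f : Float::NAN
  end
end

puts to_xpath_number(3).inspect
puts to_xpath_number(" 2.5 ").inspect
puts to_xpath_number("abc").nan?
```

With the old `kind_of? Float` test an Integer would have fallen through to the string branch; matching on `Numeric` avoids that round trip.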

View file

@ -58,5 +58,9 @@ module REXML
def node_type def node_type
:processing_instruction :processing_instruction
end end
def inspect
"<?p-i #{target} ...?>"
end
end end
end end

View file

@ -36,5 +36,31 @@ module REXML
def parent? def parent?
false; false;
end end
# Visit all subnodes of +self+ recursively
def each_recursive(&block) # :yields: node
self.elements.each {|node|
block.call(node)
node.each_recursive(&block)
}
end
# Find (and return) first subnode (recursively) for which the block
# evaluates to true. Returns +nil+ if none was found.
def find_first_recursive(&block) # :yields: node
each_recursive {|node|
return node if block.call(node)
}
return nil
end
# Returns the index that +self+ has in its parent's elements array, so that
# the following equation holds true:
#
# node == node.parent.elements[node.index_in_parent]
def index_in_parent
parent.index(self)+1
end
end end
end end
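
The two traversal helpers added above can be tried without REXML. This is a minimal sketch on a made-up `TreeNode` struct (a hypothetical type, with `children` standing in for `elements`); the method bodies follow the same depth-first pattern as the hunk.

```ruby
# Hypothetical tree type; `children` plays the role of `elements`.
TreeNode = Struct.new(:name, :children) do
  # Visit every descendant depth-first, parents before their children.
  def each_recursive(&block)
    children.each do |child|
      block.call(child)
      child.each_recursive(&block)
    end
  end

  # Return the first descendant for which the block is true, else nil.
  def find_first_recursive(&block)
    each_recursive { |node| return node if block.call(node) }
    nil
  end
end

tree = TreeNode.new("a", [
  TreeNode.new("b", [TreeNode.new("c", [])]),
  TreeNode.new("d", [])
])

visited = []
tree.each_recursive { |node| visited << node.name }
hit  = tree.find_first_recursive { |node| node.name == "c" }
miss = tree.find_first_recursive { |node| node.name == "z" }
```

The `return` inside the block leaves `find_first_recursive` as soon as a match is found, so the rest of the tree is not visited.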

View file

@ -23,13 +23,13 @@ module REXML
# end # end
# #
# Nat Price gave me some good ideas for the API. # Nat Price gave me some good ideas for the API.
class PullParser < BaseParser class PullParser
include XMLTokens include XMLTokens
def initialize stream def initialize stream
super
@entities = {} @entities = {}
@listeners = nil @listeners = nil
@parser = BaseParser.new( stream )
end end
def add_listener( listener ) def add_listener( listener )
@ -44,21 +44,38 @@ module REXML
end end
def peek depth=0 def peek depth=0
PullEvent.new(super) PullEvent.new(@parser.peek(depth))
end end
def has_next?
@parser.has_next?
end
def pull def pull
event = super event = @parser.pull
case event[0] case event[0]
when :entitydecl when :entitydecl
@entities[ event[1] ] = @entities[ event[1] ] =
event[2] unless event[2] =~ /PUBLIC|SYSTEM/ event[2] unless event[2] =~ /PUBLIC|SYSTEM/
when :text when :text
unnormalized = unnormalize( event[1], @entities ) unnormalized = @parser.unnormalize( event[1], @entities )
event << unnormalized event << unnormalized
end end
PullEvent.new( event ) PullEvent.new( event )
end end
def unshift token
@parser.unshift token
end
def entity reference
@parser.entity( reference )
end
def empty?
@parser.empty?
end
end end
# A parsing event. The contents of the event are accessed as an +Array?, # A parsing event. The contents of the event are accessed as an +Array?,
@ -73,44 +90,65 @@ module REXML
def initialize(arg) def initialize(arg)
@contents = arg @contents = arg
end end
def []( index )
@contents[index+1] def []( start, endd=nil)
if start.kind_of? Range
@contents.slice( start.begin+1 .. start.end )
elsif start.kind_of? Numeric
if endd.nil?
@contents.slice( start+1 )
else
@contents.slice( start+1, endd )
end
else
raise "Illegal argument #{start.inspect} (#{start.class})"
end
end end
def event_type def event_type
@contents[0] @contents[0]
end end
# Content: [ String tag_name, Hash attributes ] # Content: [ String tag_name, Hash attributes ]
def start_element? def start_element?
@contents[0] == :start_element @contents[0] == :start_element
end end
# Content: [ String tag_name ] # Content: [ String tag_name ]
def end_element? def end_element?
@contents[0] == :end_element @contents[0] == :end_element
end end
# Content: [ String raw_text, String unnormalized_text ] # Content: [ String raw_text, String unnormalized_text ]
def text? def text?
@contents[0] == :text @contents[0] == :text
end end
# Content: [ String text ] # Content: [ String text ]
def instruction? def instruction?
@contents[0] == :processing_instruction @contents[0] == :processing_instruction
end end
# Content: [ String text ] # Content: [ String text ]
def comment? def comment?
@contents[0] == :comment @contents[0] == :comment
end end
# Content: [ String name, String pub_sys, String long_name, String uri ] # Content: [ String name, String pub_sys, String long_name, String uri ]
def doctype? def doctype?
@contents[0] == :start_doctype @contents[0] == :start_doctype
end end
# Content: [ String text ] # Content: [ String text ]
def attlistdecl? def attlistdecl?
@contents[0] == :attlistdecl @contents[0] == :attlistdecl
end end
# Content: [ String text ] # Content: [ String text ]
def elementdecl? def elementdecl?
@contents[0] == :elementdecl @contents[0] == :elementdecl
end end
# Due to the wonders of DTDs, an entity declaration can be just about # Due to the wonders of DTDs, an entity declaration can be just about
# anything. There's no way to normalize it; you'll have to interpret the # anything. There's no way to normalize it; you'll have to interpret the
# content yourself. However, the following is true: # content yourself. However, the following is true:
@ -121,28 +159,33 @@ module REXML
def entitydecl? def entitydecl?
@contents[0] == :entitydecl @contents[0] == :entitydecl
end end
# Content: [ String text ] # Content: [ String text ]
def notationdecl? def notationdecl?
@contents[0] == :notationdecl @contents[0] == :notationdecl
end end
# Content: [ String text ] # Content: [ String text ]
def entity? def entity?
@contents[0] == :entity @contents[0] == :entity
end end
# Content: [ String text ] # Content: [ String text ]
def cdata? def cdata?
@contents[0] == :cdata @contents[0] == :cdata
end end
# Content: [ String version, String encoding, String standalone ] # Content: [ String version, String encoding, String standalone ]
def xmldecl? def xmldecl?
@contents[0] == :xmldecl @contents[0] == :xmldecl
end end
def error? def error?
@contents[0] == :error @contents[0] == :error
end end
def inspect def inspect
@contents[0].to_s + ": " + @contents[1..-1].inspect @contents[0].to_s + ": " + @contents[1..-1].inspect
end end
end end
end end
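
The pull parser hunk above stops inheriting from `BaseParser` and forwards calls to a wrapped `@parser` instead. A stripped-down sketch of that delegation pattern follows. `TokenSource` and `WrappingParser` are hypothetical stand-ins, and the events are hand-built arrays rather than real parser output.

```ruby
# Hypothetical stand-in for BaseParser: hands out pre-built event arrays.
class TokenSource
  def initialize(events)
    @events = events.dup
  end

  def has_next?
    !@events.empty?
  end

  def pull
    @events.shift
  end

  def peek(depth = 0)
    @events[depth]
  end
end

# Wraps a TokenSource instead of inheriting from it, so only the methods
# listed here are part of the public interface.
class WrappingParser
  attr_reader :entities

  def initialize(events)
    @source = TokenSource.new(events)
    @entities = {}
  end

  def has_next?
    @source.has_next?
  end

  def peek(depth = 0)
    @source.peek(depth)
  end

  def pull
    event = @source.pull
    # Remember internal entity declarations as they stream past.
    @entities[event[1]] = event[2] if event[0] == :entitydecl
    event
  end
end

parser = WrappingParser.new([[:entitydecl, "me", "Sean"], [:text, "hi &me;"]])
seen = []
seen << parser.pull[0] while parser.has_next?
```

The cost of delegation is that every method the wrapper should expose has to be forwarded by hand, which is why the hunk adds `has_next?`, `unshift`, `entity` and `empty?` explicitly.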

View file

@ -12,6 +12,7 @@ module REXML
@namespace_stack = [] @namespace_stack = []
@has_listeners = false @has_listeners = false
@tag_stack = [] @tag_stack = []
@entities = {}
end end
def add_listener( listener ) def add_listener( listener )
@ -143,10 +144,21 @@ module REXML
end end
end end
when :text when :text
normalized = @parser.normalize( event[1] ) #normalized = @parser.normalize( event[1] )
handle( :characters, normalized ) #handle( :characters, normalized )
copy = event[1].clone
@entities.each { |key, value| copy = copy.gsub("&#{key};", value) }
copy.gsub!( Text::NUMERICENTITY ) {|m|
m=$1
m = "0#{m}" if m[0] == ?x
[Integer(m)].pack('U*')
}
handle( :characters, copy )
when :entitydecl
@entities[ event[1] ] = event[2] if event.size == 3
handle( *event )
when :processing_instruction, :comment, :doctype, :attlistdecl, when :processing_instruction, :comment, :doctype, :attlistdecl,
:elementdecl, :entitydecl, :cdata, :notationdecl, :xmldecl :elementdecl, :cdata, :notationdecl, :xmldecl
handle( *event ) handle( *event )
end end
end end
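
The `:text` branch above replaces declared entities and then numeric character references. Here is a self-contained sketch of the same two passes. `denormalize_text` is a hypothetical helper name, and `NUMERIC_ENTITY` is copied from the `Text::NUMERICENTITY` pattern that appears later in this diff.

```ruby
# Same pattern as Text::NUMERICENTITY in the diff.
NUMERIC_ENTITY = /&#0*((?:\d+)|(?:x[a-fA-F0-9]+));/

# Hypothetical helper: entities maps entity names to replacement text.
def denormalize_text(text, entities = {})
  copy = text.dup
  # Pass 1: declared internal entities, e.g. &me; becomes its value.
  entities.each { |name, value| copy = copy.gsub("&#{name};", value) }
  # Pass 2: numeric references, decimal (&#65;) or hexadecimal (&#x41;).
  copy.gsub(NUMERIC_ENTITY) do
    digits = $1
    digits = "0#{digits}" if digits.start_with?("x")  # "x41" -> "0x41"
    [Integer(digits)].pack("U*")                      # code point -> UTF-8
  end
end

result = denormalize_text("&#65;&#x42; &me;", { "me" => "C" })
puts result
```

Prefixing the hexadecimal form with `0` lets `Integer` parse it as a `0x` literal, which is the same trick the hunk uses.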

View file

@ -31,9 +31,8 @@ module REXML
@listener.instruction( *event[1,2] ) @listener.instruction( *event[1,2] )
when :start_doctype when :start_doctype
@listener.doctype( *event[1..-1] ) @listener.doctype( *event[1..-1] )
when :notationdecl, :entitydecl, :elementdecl when :comment, :attlistdecl, :notationdecl, :elementdecl,
@listener.notationdecl( event[1..-1] ) :entitydecl, :cdata, :xmldecl, :attlistdecl
when :comment, :attlistdecl, :elementdecl, :cdata, :xmldecl
@listener.send( event[0].to_s, *event[1..-1] ) @listener.send( event[0].to_s, *event[1..-1] )
end end
end end

View file

@ -20,7 +20,7 @@ module REXML
path.gsub!(/([\(\[])\s+/, '\1') # Strip ignorable spaces path.gsub!(/([\(\[])\s+/, '\1') # Strip ignorable spaces
path.gsub!( /\s+([\]\)])/, '\1' ) path.gsub!( /\s+([\]\)])/, '\1' )
parsed = [] parsed = []
path = LocationPath(path, parsed) path = OrExpr(path, parsed)
parsed parsed
end end
@ -302,7 +302,7 @@ module REXML
path = path[1..-1] path = path[1..-1]
end end
parsed << :processing_instruction parsed << :processing_instruction
parsed << literal parsed << (literal || '')
when NCNAMETEST when NCNAMETEST
#puts "NCNAMETEST" #puts "NCNAMETEST"
prefix = $1 prefix = $1
@ -589,9 +589,10 @@ module REXML
when /^(\w[-\w]*)(?:\()/ when /^(\w[-\w]*)(?:\()/
#puts "PrimaryExpr :: Function >>> #$1 -- '#$''" #puts "PrimaryExpr :: Function >>> #$1 -- '#$''"
fname = $1 fname = $1
path = $' tmp = $'
#puts "#{fname} =~ #{NT.inspect}" #puts "#{fname} =~ #{NT.inspect}"
#return nil if fname =~ NT return path if fname =~ NT
path = tmp
parsed << :function parsed << :function
parsed << fname parsed << fname
path = FunctionCall(path, parsed) path = FunctionCall(path, parsed)

View file

@ -10,8 +10,8 @@
# #
# Main page:: http://www.germane-software.com/software/rexml # Main page:: http://www.germane-software.com/software/rexml
# Author:: Sean Russell <serATgermaneHYPHENsoftwareDOTcom> # Author:: Sean Russell <serATgermaneHYPHENsoftwareDOTcom>
# Version:: 3.1.1 # Version:: 3.1.3
# Date:: +2004/162 # Date:: +2005/139
# #
# This API documentation can be downloaded from the REXML home page, or can # This API documentation can be downloaded from the REXML home page, or can
# be accessed online[http://www.germane-software.com/software/rexml_doc] # be accessed online[http://www.germane-software.com/software/rexml_doc]
@ -20,8 +20,7 @@
# or can be accessed # or can be accessed
# online[http://www.germane-software.com/software/rexml/docs/tutorial.html] # online[http://www.germane-software.com/software/rexml/docs/tutorial.html]
module REXML module REXML
Copyright = "Copyright © 2001, 2002, 2003, 2004 Sean Russell <ser@germane-software.com>" Copyright = "Copyright © 2001-2005 Sean Russell <ser@germane-software.com>"
Date = "+2004/186" Date = "+2005/139"
Version = "3.1.2" Version = "3.1.3"
end end

View file

@ -0,0 +1,33 @@
module REXML
class SyncEnumerator
include Enumerable
# Creates a new SyncEnumerator which enumerates rows of given
# Enumerable objects.
def initialize(*enums)
@gens = enums
@biggest = @gens[0]
@gens.each {|x| @biggest = x if x.size > @biggest.size }
end
# Returns the number of enumerated Enumerable objects, i.e. the size
# of each row.
def size
@gens.size
end
# Returns the number of enumerated Enumerable objects, i.e. the size
# of each row.
def length
@gens.length
end
# Enumerates rows of the Enumerable objects.
def each
@biggest.zip( *@gens ) {|a|
yield(*a[1..-1])
}
self
end
end
end
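
The zip-based approach in the file above can be sketched on its own. `RowEnumerator` is a hypothetical name, and the sketch assumes every input is an Array (anything that responds to `size` and `zip` should behave the same way).

```ruby
# Hypothetical stand-in for the class in the diff; not REXML's API.
class RowEnumerator
  include Enumerable

  def initialize(*enums)
    @enums = enums
    # Drive iteration from the longest input so shorter ones pad with nil.
    @longest = enums.max_by { |e| e.size }
  end

  def each
    @longest.zip(*@enums) do |row|
      # row[0] comes from @longest itself; the rest mirror the inputs in order.
      yield(*row[1..-1])
    end
    self
  end
end

rows = RowEnumerator.new([1, 2, 3], %w[a b]).to_a
```

Because the work is done by a single `zip` call, no generator or continuation is needed to step the inputs in lockstep, which is a likely source of the speed difference mentioned in the commit message.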

View file

@ -5,180 +5,182 @@ require 'rexml/doctype'
require 'rexml/parseexception' require 'rexml/parseexception'
module REXML module REXML
# Represents text nodes in an XML document # Represents text nodes in an XML document
class Text < Child class Text < Child
include Comparable include Comparable
# The order in which the substitutions occur # The order in which the substitutions occur
SPECIALS = [ /&(?!#?[\w-]+;)/u, /</u, />/u, /"/u, /'/u, /\r/u ] SPECIALS = [ /&(?!#?[\w-]+;)/u, /</u, />/u, /"/u, /'/u, /\r/u ]
SUBSTITUTES = ['&amp;', '&lt;', '&gt;', '&quot;', '&apos;', '&#13;'] SUBSTITUTES = ['&amp;', '&lt;', '&gt;', '&quot;', '&apos;', '&#13;']
# Characters which are substituted in written strings # Characters which are substituted in written strings
SLAICEPS = [ '<', '>', '"', "'", '&' ] SLAICEPS = [ '<', '>', '"', "'", '&' ]
SETUTITSBUS = [ /&lt;/u, /&gt;/u, /&quot;/u, /&apos;/u, /&amp;/u ] SETUTITSBUS = [ /&lt;/u, /&gt;/u, /&quot;/u, /&apos;/u, /&amp;/u ]
# If +raw+ is true, then REXML leaves the value alone # If +raw+ is true, then REXML leaves the value alone
attr_accessor :raw attr_accessor :raw
ILLEGAL = /(<|&(?!(#{Entity::NAME})|(#0*((?:\d+)|(?:x[a-fA-F0-9]+)));))/um ILLEGAL = /(<|&(?!(#{Entity::NAME})|(#0*((?:\d+)|(?:x[a-fA-F0-9]+)));))/um
NUMERICENTITY = /&#0*((?:\d+)|(?:x[a-fA-F0-9]+));/ NUMERICENTITY = /&#0*((?:\d+)|(?:x[a-fA-F0-9]+));/
# Constructor # Constructor
# +arg+ if a String, the content is set to the String. If a Text, # +arg+ if a String, the content is set to the String. If a Text,
# the object is shallowly cloned. # the object is shallowly cloned.
# #
# +respect_whitespace+ (boolean, false) if true, whitespace is # +respect_whitespace+ (boolean, false) if true, whitespace is
# respected # respected
# #
# +parent+ (nil) if this is a Parent object, the parent # +parent+ (nil) if this is a Parent object, the parent
# will be set to this. # will be set to this.
# #
# +raw+ (nil) This argument can be given three values. # +raw+ (nil) This argument can be given three values.
# If true, then the value used to construct this object is expected to # If true, then the value used to construct this object is expected to
# contain no unescaped XML markup, and REXML will not change the text. If # contain no unescaped XML markup, and REXML will not change the text. If
# this value is false, the string may contain any characters, and REXML will # this value is false, the string may contain any characters, and REXML will
# escape any and all defined entities whose values are contained in the # escape any and all defined entities whose values are contained in the
# text. If this value is nil (the default), then the raw value of the # text. If this value is nil (the default), then the raw value of the
# parent will be used as the raw value for this node. If there is no raw # parent will be used as the raw value for this node. If there is no raw
# value for the parent, and no value is supplied, the default is false. # value for the parent, and no value is supplied, the default is false.
# Text.new( "<&", false, nil, false ) #-> "&lt;&amp;" # Text.new( "<&", false, nil, false ) #-> "&lt;&amp;"
# Text.new( "<&", false, nil, true ) #-> IllegalArgumentException # Text.new( "<&", false, nil, true ) #-> IllegalArgumentException
# Text.new( "&lt;&amp;", false, nil, true ) #-> "&lt;&amp;" # Text.new( "&lt;&amp;", false, nil, true ) #-> "&lt;&amp;"
# # Assume that the entity "s" is defined to be "sean" # # Assume that the entity "s" is defined to be "sean"
# # and that the entity "r" is defined to be "russell" # # and that the entity "r" is defined to be "russell"
# Text.new( "sean russell" ) #-> "&s; &r;" # Text.new( "sean russell" ) #-> "&s; &r;"
# Text.new( "sean russell", false, nil, true ) #-> "sean russell" # Text.new( "sean russell", false, nil, true ) #-> "sean russell"
# #
# +entity_filter+ (nil) This can be an array of entities to match in the # +entity_filter+ (nil) This can be an array of entities to match in the
# supplied text. This argument is only useful if +raw+ is set to false. # supplied text. This argument is only useful if +raw+ is set to false.
# Text.new( "sean russell", false, nil, false, ["s"] ) #-> "&s; russell" # Text.new( "sean russell", false, nil, false, ["s"] ) #-> "&s; russell"
# Text.new( "sean russell", false, nil, true, ["s"] ) #-> "sean russell" # Text.new( "sean russell", false, nil, true, ["s"] ) #-> "sean russell"
# In the last example, the +entity_filter+ argument is ignored. # In the last example, the +entity_filter+ argument is ignored.
# #
# +pattern+ INTERNAL USE ONLY # +pattern+ INTERNAL USE ONLY
def initialize(arg, respect_whitespace=false, parent=nil, raw=nil, def initialize(arg, respect_whitespace=false, parent=nil, raw=nil,
entity_filter=nil, illegal=ILLEGAL ) entity_filter=nil, illegal=ILLEGAL )
@raw = false @raw = false
if parent if parent
super( parent ) super( parent )
@raw = parent.raw @raw = parent.raw
else else
@parent = nil @parent = nil
end end
@raw = raw unless raw.nil? @raw = raw unless raw.nil?
@entity_filter = entity_filter @entity_filter = entity_filter
@normalized = @unnormalized = nil @normalized = @unnormalized = nil
if arg.kind_of? String if arg.kind_of? String
@string = arg.clone @string = arg.clone
@string.squeeze!(" \n\t") unless respect_whitespace @string.squeeze!(" \n\t") unless respect_whitespace
elsif arg.kind_of? Text elsif arg.kind_of? Text
@string = arg.to_s @string = arg.to_s
@raw = arg.raw @raw = arg.raw
elsif elsif
raise Exception.new( "Illegal argument of type #{arg.type} for Text constructor (#{arg})" ) raise "Illegal argument of type #{arg.type} for Text constructor (#{arg})"
end end
@string.gsub!( /\r\n?/, "\n" ) @string.gsub!( /\r\n?/, "\n" )
# check for illegal characters # check for illegal characters
if @raw if @raw
if @string =~ illegal if @string =~ illegal
raise Exception.new( raise "Illegal character '#{$1}' in raw string \"#{@string}\""
"Illegal character '#{$1}' in raw string \"#{@string}\"" end
) end
end end
end
end
def node_type def node_type
:text :text
end end
def empty? def empty?
@string.size==0 @string.size==0
end end
def clone def clone
return Text.new(self) return Text.new(self)
end end
# Appends text to this text node. The text is appended in the +raw+ mode # Appends text to this text node. The text is appended in the +raw+ mode
# of this text node. # of this text node.
def <<( to_append ) def <<( to_append )
@string << to_append.gsub( /\r\n?/, "\n" ) @string << to_append.gsub( /\r\n?/, "\n" )
end end
# +other+ a String or a Text # +other+ a String or a Text
# +returns+ the result of (to_s <=> arg.to_s) # +returns+ the result of (to_s <=> arg.to_s)
def <=>( other ) def <=>( other )
to_s() <=> other.to_s to_s() <=> other.to_s
end end
REFERENCE = /#{Entity::REFERENCE}/ REFERENCE = /#{Entity::REFERENCE}/
# Returns the string value of this text node. This string is always # Returns the string value of this text node. This string is always
# escaped, meaning that it is a valid XML text node string, and all # escaped, meaning that it is a valid XML text node string, and all
# entities that can be escaped, have been inserted. This method respects # entities that can be escaped, have been inserted. This method respects
# the entity filter set in the constructor. # the entity filter set in the constructor.
# #
# # Assume that the entity "s" is defined to be "sean", and that the # # Assume that the entity "s" is defined to be "sean", and that the
# # entity "r" is defined to be "russell" # # entity "r" is defined to be "russell"
# t = Text.new( "< & sean russell", false, nil, false, ['s'] ) # t = Text.new( "< & sean russell", false, nil, false, ['s'] )
# t.to_s #-> "&lt; &amp; &s; russell" # t.to_s #-> "&lt; &amp; &s; russell"
# t = Text.new( "< & &s; russell", false, nil, false ) # t = Text.new( "< & &s; russell", false, nil, false )
# t.to_s #-> "&lt; &amp; &s; russell" # t.to_s #-> "&lt; &amp; &s; russell"
# u = Text.new( "sean russell", false, nil, true ) # u = Text.new( "sean russell", false, nil, true )
# u.to_s #-> "sean russell" # u.to_s #-> "sean russell"
def to_s def to_s
return @string if @raw return @string if @raw
return @normalized if @normalized return @normalized if @normalized
doctype = nil doctype = nil
if @parent if @parent
doc = @parent.document doc = @parent.document
doctype = doc.doctype if doc doctype = doc.doctype if doc
end end
@normalized = Text::normalize( @string, doctype, @entity_filter ) @normalized = Text::normalize( @string, doctype, @entity_filter )
end end
# Returns the string value of this text. This is the text without def inspect
# entities, as it might be used programmatically, or printed to the @string.inspect
# console. This ignores the 'raw' attribute setting, and any end
# entity_filter.
# # Returns the string value of this text. This is the text without
# # Assume that the entity "s" is defined to be "sean", and that the # entities, as it might be used programmatically, or printed to the
# # entity "r" is defined to be "russell" # console. This ignores the 'raw' attribute setting, and any
# t = Text.new( "< & sean russell", false, nil, false, ['s'] ) # entity_filter.
# t.string #-> "< & sean russell" #
# t = Text.new( "< & &s; russell", false, nil, false ) # # Assume that the entity "s" is defined to be "sean", and that the
# t.string #-> "< & sean russell" # # entity "r" is defined to be "russell"
# u = Text.new( "sean russell", false, nil, true ) # t = Text.new( "< & sean russell", false, nil, false, ['s'] )
# u.string #-> "sean russell" # t.string #-> "< & sean russell"
def value # t = Text.new( "< & &s; russell", false, nil, false )
@unnormalized if @unnormalized # t.string #-> "< & sean russell"
doctype = nil # u = Text.new( "sean russell", false, nil, true )
if @parent # u.string #-> "sean russell"
doc = @parent.document def value
doctype = doc.doctype if doc @unnormalized if @unnormalized
end doctype = nil
@unnormalized = Text::unnormalize( @string, doctype ) if @parent
end doc = @parent.document
doctype = doc.doctype if doc
def wrap(string, width, addnewline=false) end
# Recursively wrap string at width. @unnormalized = Text::unnormalize( @string, doctype )
return string if string.length <= width end
place = string.rindex(' ', width) # Position in string with last ' ' before cutoff
if addnewline then def wrap(string, width, addnewline=false)
return "\n" + string[0,place] + "\n" + wrap(string[place+1..-1], width) # Recursively wrap string at width.
else return string if string.length <= width
return string[0,place] + "\n" + wrap(string[place+1..-1], width) place = string.rindex(' ', width) # Position in string with last ' ' before cutoff
end if addnewline then
end return "\n" + string[0,place] + "\n" + wrap(string[place+1..-1], width)
else
return string[0,place] + "\n" + wrap(string[place+1..-1], width)
end
end
# Sets the contents of this text node. This expects the text to be # Sets the contents of this text node. This expects the text to be
# unnormalized. It returns self. # unnormalized. It returns self.
@@ -188,26 +190,26 @@ module REXML
# e[0].value = "bar" # <a>bar</a> # e[0].value = "bar" # <a>bar</a>
# e[0].value = "<a>" # <a>&lt;a&gt;</a> # e[0].value = "<a>" # <a>&lt;a&gt;</a>
def value=( val ) def value=( val )
@string = val.gsub( /\r\n?/, "\n" ) @string = val.gsub( /\r\n?/, "\n" )
@unnormalized = nil @unnormalized = nil
@normalized = nil @normalized = nil
@raw = false @raw = false
end end
def indent_text(string, level=1, style="\t", indentfirstline=true) def indent_text(string, level=1, style="\t", indentfirstline=true)
return string if level < 0 return string if level < 0
new_string = '' new_string = ''
string.each { |line| string.each { |line|
indent_string = style * level indent_string = style * level
new_line = (indent_string + line).sub(/[\s]+$/,'') new_line = (indent_string + line).sub(/[\s]+$/,'')
new_string << new_line new_string << new_line
} }
new_string.strip! unless indentfirstline new_string.strip! unless indentfirstline
return new_string return new_string
end end
def write( writer, indent=-1, transitive=false, ie_hack=false ) def write( writer, indent=-1, transitive=false, ie_hack=false )
s = to_s() s = to_s()
if not (@parent and @parent.whitespace) then if not (@parent and @parent.whitespace) then
s = wrap(s, 60, false) if @parent and @parent.context[:wordwrap] == :all s = wrap(s, 60, false) if @parent and @parent.context[:wordwrap] == :all
if @parent and not @parent.context[:indentstyle].nil? and indent > 0 and s.count("\n") > 0 if @parent and not @parent.context[:indentstyle].nil? and indent > 0 and s.count("\n") > 0
@@ -216,7 +218,7 @@ module REXML
s.squeeze!(" \n\t") if @parent and !@parent.whitespace s.squeeze!(" \n\t") if @parent and !@parent.whitespace
end end
writer << s writer << s
end end
# FIXME # FIXME
# This probably won't work properly # This probably won't work properly
@@ -226,111 +228,111 @@ module REXML
return path return path
end end
# Writes out text, substituting special characters beforehand. # Writes out text, substituting special characters beforehand.
# +out+ A String, IO, or any other object supporting <<( String ) # +out+ A String, IO, or any other object supporting <<( String )
# +input+ the text to substitute and then write out # +input+ the text to substitute and then write out
# #
# z=utf8.unpack("U*") # z=utf8.unpack("U*")
# ascOut="" # ascOut=""
# z.each{|r| # z.each{|r|
# if r < 0x100 # if r < 0x100
# ascOut.concat(r.chr) # ascOut.concat(r.chr)
# else # else
# ascOut.concat(sprintf("&#x%x;", r)) # ascOut.concat(sprintf("&#x%x;", r))
# end # end
# } # }
# puts ascOut # puts ascOut
def write_with_substitution out, input def write_with_substitution out, input
copy = input.clone copy = input.clone
# Doing it like this rather than in a loop improves the speed # Doing it like this rather than in a loop improves the speed
copy.gsub!( SPECIALS[0], SUBSTITUTES[0] ) copy.gsub!( SPECIALS[0], SUBSTITUTES[0] )
copy.gsub!( SPECIALS[1], SUBSTITUTES[1] ) copy.gsub!( SPECIALS[1], SUBSTITUTES[1] )
copy.gsub!( SPECIALS[2], SUBSTITUTES[2] ) copy.gsub!( SPECIALS[2], SUBSTITUTES[2] )
copy.gsub!( SPECIALS[3], SUBSTITUTES[3] ) copy.gsub!( SPECIALS[3], SUBSTITUTES[3] )
copy.gsub!( SPECIALS[4], SUBSTITUTES[4] ) copy.gsub!( SPECIALS[4], SUBSTITUTES[4] )
copy.gsub!( SPECIALS[5], SUBSTITUTES[5] ) copy.gsub!( SPECIALS[5], SUBSTITUTES[5] )
out << copy out << copy
end end
# Reads text, substituting entities # Reads text, substituting entities
def Text::read_with_substitution( input, illegal=nil ) def Text::read_with_substitution( input, illegal=nil )
copy = input.clone copy = input.clone
if copy =~ illegal if copy =~ illegal
raise ParseException.new( "malformed text: Illegal character #$& in \"#{copy}\"" ) raise ParseException.new( "malformed text: Illegal character #$& in \"#{copy}\"" )
end if illegal end if illegal
copy.gsub!( /\r\n?/, "\n" ) copy.gsub!( /\r\n?/, "\n" )
if copy.include? ?& if copy.include? ?&
copy.gsub!( SETUTITSBUS[0], SLAICEPS[0] ) copy.gsub!( SETUTITSBUS[0], SLAICEPS[0] )
copy.gsub!( SETUTITSBUS[1], SLAICEPS[1] ) copy.gsub!( SETUTITSBUS[1], SLAICEPS[1] )
copy.gsub!( SETUTITSBUS[2], SLAICEPS[2] ) copy.gsub!( SETUTITSBUS[2], SLAICEPS[2] )
copy.gsub!( SETUTITSBUS[3], SLAICEPS[3] ) copy.gsub!( SETUTITSBUS[3], SLAICEPS[3] )
copy.gsub!( SETUTITSBUS[4], SLAICEPS[4] ) copy.gsub!( SETUTITSBUS[4], SLAICEPS[4] )
copy.gsub!( /&#0*((?:\d+)|(?:x[a-f0-9]+));/ ) {|m| copy.gsub!( /&#0*((?:\d+)|(?:x[a-f0-9]+));/ ) {|m|
m=$1 m=$1
#m='0' if m=='' #m='0' if m==''
m = "0#{m}" if m[0] == ?x m = "0#{m}" if m[0] == ?x
[Integer(m)].pack('U*') [Integer(m)].pack('U*')
} }
end end
copy copy
end end
EREFERENCE = /&(?!#{Entity::NAME};)/ EREFERENCE = /&(?!#{Entity::NAME};)/
# Escapes all possible entities # Escapes all possible entities
def Text::normalize( input, doctype=nil, entity_filter=nil ) def Text::normalize( input, doctype=nil, entity_filter=nil )
copy = input.clone copy = input.clone
# Doing it like this rather than in a loop improves the speed # Doing it like this rather than in a loop improves the speed
if doctype if doctype
copy = copy.gsub( EREFERENCE, '&amp;' ) copy = copy.gsub( EREFERENCE, '&amp;' )
doctype.entities.each_value do |entity| doctype.entities.each_value do |entity|
copy = copy.gsub( entity.value, copy = copy.gsub( entity.value,
"&#{entity.name};" ) if entity.value and "&#{entity.name};" ) if entity.value and
not( entity_filter and entity_filter.include?(entity) ) not( entity_filter and entity_filter.include?(entity) )
end end
else else
copy = copy.gsub( EREFERENCE, '&amp;' ) copy = copy.gsub( EREFERENCE, '&amp;' )
DocType::DEFAULT_ENTITIES.each_value do |entity| DocType::DEFAULT_ENTITIES.each_value do |entity|
copy = copy.gsub(entity.value, "&#{entity.name};" ) copy = copy.gsub(entity.value, "&#{entity.name};" )
end end
end end
copy copy
end end
# Unescapes all possible entities # Unescapes all possible entities
def Text::unnormalize( string, doctype=nil, filter=nil, illegal=nil ) def Text::unnormalize( string, doctype=nil, filter=nil, illegal=nil )
rv = string.clone rv = string.clone
rv.gsub!( /\r\n?/, "\n" ) rv.gsub!( /\r\n?/, "\n" )
matches = rv.scan( REFERENCE ) matches = rv.scan( REFERENCE )
return rv if matches.size == 0 return rv if matches.size == 0
rv.gsub!( NUMERICENTITY ) {|m| rv.gsub!( NUMERICENTITY ) {|m|
m=$1 m=$1
m = "0#{m}" if m[0] == ?x m = "0#{m}" if m[0] == ?x
[Integer(m)].pack('U*') [Integer(m)].pack('U*')
} }
matches.collect!{|x|x[0]}.compact! matches.collect!{|x|x[0]}.compact!
if matches.size > 0 if matches.size > 0
if doctype if doctype
matches.each do |entity_reference| matches.each do |entity_reference|
unless filter and filter.include?(entity_reference) unless filter and filter.include?(entity_reference)
entity_value = doctype.entity( entity_reference ) entity_value = doctype.entity( entity_reference )
re = /&#{entity_reference};/ re = /&#{entity_reference};/
rv.gsub!( re, entity_value ) if entity_value rv.gsub!( re, entity_value ) if entity_value
end end
end end
else else
matches.each do |entity_reference| matches.each do |entity_reference|
unless filter and filter.include?(entity_reference) unless filter and filter.include?(entity_reference)
entity_value = DocType::DEFAULT_ENTITIES[ entity_reference ] entity_value = DocType::DEFAULT_ENTITIES[ entity_reference ]
re = /&#{entity_reference};/ re = /&#{entity_reference};/
rv.gsub!( re, entity_value.value ) if entity_value rv.gsub!( re, entity_value.value ) if entity_value
end end
end end
end end
rv.gsub!( /&amp;/, '&' ) rv.gsub!( /&amp;/, '&' )
end end
rv rv
end end
end end
end end
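The substitution tables and the numeric-entity handling in this file can be exercised outside REXML. The sketch below is an assumption-laden illustration, not the library's code: the helper names `escape_specials` and `decode_numeric_entities` are hypothetical, the `/u` flags are dropped, and only the patterns themselves are taken from the diff.

```ruby
# Patterns and replacements copied from the Text class; order matters,
# because '&' must be escaped before the other substitutions introduce new '&'.
SPECIALS    = [ /&(?!#?[\w-]+;)/, /</, />/, /"/, /'/, /\r/ ]
SUBSTITUTES = ['&amp;', '&lt;', '&gt;', '&quot;', '&apos;', '&#13;']
NUMERIC_ENTITY = /&#0*((?:\d+)|(?:x[a-fA-F0-9]+));/

# Hypothetical helper: applies each substitution in order on a copy.
def escape_specials(input)
  copy = input.dup
  SPECIALS.each_with_index { |pattern, i| copy.gsub!(pattern, SUBSTITUTES[i]) }
  copy
end

# Hypothetical helper: turns &#65; and &#x41; style references into characters.
def decode_numeric_entities(input)
  input.gsub(NUMERIC_ENTITY) do
    digits = $1
    digits = "0#{digits}" if digits.start_with?("x") # "x41" -> "0x41" for Integer()
    [Integer(digits)].pack("U*")
  end
end

escaped = escape_specials(%q{a < b & "c" &amp; d})
decoded = decode_numeric_entities("&#65;&#x42;&#0067;")
puts escaped
puts decoded
```

The negative lookahead in the first pattern is what keeps an existing `&amp;` from being escaped a second time.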


@@ -94,6 +94,10 @@ module REXML
@writethis = true @writethis = true
end end
def inspect
START.sub(/\\/u, '') + " ... " + STOP.sub(/\\/u, '')
end
private private
def content(enc) def content(enc)
rv = "version='#@version'" rv = "version='#@version'"


@@ -2,61 +2,76 @@ require 'rexml/functions'
require 'rexml/xpath_parser' require 'rexml/xpath_parser'
module REXML module REXML
# Wrapper class. Use this class to access the XPath functions. # Wrapper class. Use this class to access the XPath functions.
class XPath class XPath
include Functions include Functions
EMPTY_HASH = {} EMPTY_HASH = {}
# Finds and returns the first node that matches the supplied xpath. # Finds and returns the first node that matches the supplied xpath.
# element:: # element::
# The context element # The context element
# path:: # path::
# The xpath to search for. If not supplied or nil, returns the first # The xpath to search for. If not supplied or nil, returns the first
# node matching '*'. # node matching '*'.
# namespaces:: # namespaces::
# If supplied, a Hash which defines a namespace mapping. # If supplied, a Hash which defines a namespace mapping.
# #
# XPath.first( node ) # XPath.first( node )
# XPath.first( doc, "//b" ) # XPath.first( doc, "//b" )
# XPath.first( node, "a/x:b", { "x"=>"http://doofus" } ) # XPath.first( node, "a/x:b", { "x"=>"http://doofus" } )
def XPath::first element, path=nil, namespaces={}, variables={} def XPath::first element, path=nil, namespaces={}, variables={}
parser = XPathParser.new =begin
parser.namespaces = namespaces raise "The namespaces argument, if supplied, must be a hash object." unless namespaces.kind_of? Hash
parser.variables = variables raise "The variables argument, if supplied, must be a hash object." unless variables.kind_of? Hash
path = "*" unless path parser = XPathParser.new
element = [element] unless element.kind_of? Array parser.namespaces = namespaces
parser.parse(path, element)[0] parser.variables = variables
end path = "*" unless path
parser.first( path, element );
=end
#=begin
raise "The namespaces argument, if supplied, must be a hash object." unless namespaces.kind_of? Hash
raise "The variables argument, if supplied, must be a hash object." unless variables.kind_of? Hash
parser = XPathParser.new
parser.namespaces = namespaces
parser.variables = variables
path = "*" unless path
element = [element] unless element.kind_of? Array
parser.parse(path, element).flatten[0]
#=end
end
# Iterates over nodes that match the given path, calling the supplied # Iterates over nodes that match the given path, calling the supplied
# block with the match. # block with the match.
# element:: # element::
# The context element # The context element
# path:: # path::
# The xpath to search for. If not supplied or nil, defaults to '*' # The xpath to search for. If not supplied or nil, defaults to '*'
# namespaces:: # namespaces::
# If supplied, a Hash which defines a namespace mapping # If supplied, a Hash which defines a namespace mapping
# #
# XPath.each( node ) { |el| ... } # XPath.each( node ) { |el| ... }
# XPath.each( node, "/*[@attr='v']" ) { |el| ... } # XPath.each( node, "/*[@attr='v']" ) { |el| ... }
# XPath.each( node, 'ancestor::x' ) { |el| ... } # XPath.each( node, 'ancestor::x' ) { |el| ... }
def XPath::each element, path=nil, namespaces={}, variables={}, &block def XPath::each element, path=nil, namespaces={}, variables={}, &block
parser = XPathParser.new raise "The namespaces argument, if supplied, must be a hash object." unless namespaces.kind_of? Hash
parser.namespaces = namespaces raise "The variables argument, if supplied, must be a hash object." unless variables.kind_of? Hash
parser.variables = variables parser = XPathParser.new
path = "*" unless path parser.namespaces = namespaces
element = [element] unless element.kind_of? Array parser.variables = variables
parser.parse(path, element).each( &block ) path = "*" unless path
end element = [element] unless element.kind_of? Array
parser.parse(path, element).each( &block )
end
# Returns an array of nodes matching a given XPath. # Returns an array of nodes matching a given XPath.
def XPath::match element, path=nil, namespaces={}, variables={} def XPath::match element, path=nil, namespaces={}, variables={}
parser = XPathParser.new parser = XPathParser.new
parser.namespaces = namespaces parser.namespaces = namespaces
parser.variables = variables parser.variables = variables
path = "*" unless path path = "*" unless path
element = [element] unless element.kind_of? Array element = [element] unless element.kind_of? Array
parser.parse(path,element) parser.parse(path,element)
end end
end end
end end
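A short usage sketch of the three wrapper methods follows. It assumes a Ruby installation where the `rexml` gem is available, and the sample document is invented for illustration. The non-Hash `namespaces` call is expected to raise given the guard clauses added in this change, though the exact exception class is not asserted here.

```ruby
require 'rexml/document'

# Sample document made up for this sketch.
doc = REXML::Document.new("<a><b>one</b><b>two</b></a>")

# first returns a single node, match returns an Array of nodes.
first_b = REXML::XPath.first(doc, "//b")
all_b   = REXML::XPath.match(doc, "//b")

# each yields every matching node to the block.
texts = []
REXML::XPath.each(doc, "//b") { |el| texts << el.text }

# Passing something other than a Hash as namespaces should be rejected.
guard_message = begin
  REXML::XPath.first(doc, "//b", "not a hash")
  nil
rescue StandardError => e
  e.message
end

puts first_b.text
puts all_b.size
puts texts.inspect
puts guard_message.inspect
```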


@@ -1,7 +1,28 @@
require 'rexml/namespace' require 'rexml/namespace'
require 'rexml/xmltokens' require 'rexml/xmltokens'
require 'rexml/attribute'
require 'rexml/syncenumerator'
require 'rexml/parsers/xpathparser' require 'rexml/parsers/xpathparser'
class Object
def dclone
clone
end
end
class Symbol
def dclone
self
end
end
class Array
def dclone
klone = self.clone
klone.clear
self.each{|v| klone << v.dclone}
klone
end
end
module REXML module REXML
# You don't want to use this class. Really. Use XPath, which is a wrapper # You don't want to use this class. Really. Use XPath, which is a wrapper
# for this class. Believe me. You don't want to poke around in here. # for this class. Believe me. You don't want to poke around in here.
@@ -28,259 +49,419 @@ module REXML
end end
def parse path, nodeset def parse path, nodeset
path_stack = @parser.parse( path ) #puts "#"*40
#puts "PARSE: #{path} => #{path_stack.inspect}" path_stack = @parser.parse( path )
#puts "PARSE: nodeset = #{nodeset.collect{|x|x.to_s}.inspect}" #puts "PARSE: #{path} => #{path_stack.inspect}"
match( path_stack, nodeset ) #puts "PARSE: nodeset = #{nodeset.inspect}"
match( path_stack, nodeset )
end
def get_first path, nodeset
#puts "#"*40
path_stack = @parser.parse( path )
#puts "PARSE: #{path} => #{path_stack.inspect}"
#puts "PARSE: nodeset = #{nodeset.inspect}"
first( path_stack, nodeset )
end end
def predicate path, nodeset def predicate path, nodeset
path_stack = @parser.predicate( path ) path_stack = @parser.parse( path )
return Predicate( path_stack, nodeset ) expr( path_stack, nodeset )
end end
def []=( variable_name, value ) def []=( variable_name, value )
@variables[ variable_name ] = value @variables[ variable_name ] = value
end end
def match( path_stack, nodeset )
while ( path_stack.size > 0 and nodeset.size > 0 ) # Performs a depth-first (document order) XPath search, and returns the
#puts "PARSE: #{path_stack.inspect} '#{nodeset.collect{|n|n.class}.inspect}'" # first match. This is the fastest, lightest way to return a single result.
nodeset = internal_parse( path_stack, nodeset ) def first( path_stack, node )
#puts "NODESET: #{nodeset}" #puts "#{depth}) Entering match( #{path.inspect}, #{tree.inspect} )"
#puts "PATH_STACK: #{path_stack.inspect}" return nil if path.size == 0
case path[0]
when :document
# do nothing
return first( path[1..-1], node )
when :child
for c in node.children
#puts "#{depth}) CHILD checking #{name(c)}"
r = first( path[1..-1], c )
#puts "#{depth}) RETURNING #{r.inspect}" if r
return r if r
end
when :qname
name = path[2]
#puts "#{depth}) QNAME #{name(tree)} == #{name} (path => #{path.size})"
if node.name == name
#puts "#{depth}) RETURNING #{tree.inspect}" if path.size == 3
return node if path.size == 3
return first( path[3..-1], node )
else
return nil
end
when :descendant_or_self
r = first( path[1..-1], node )
return r if r
for c in node.children
r = first( path, c )
return r if r
end
when :node
return first( path[1..-1], node )
when :any
return first( path[1..-1], node )
end end
nodeset return nil
end
def match( path_stack, nodeset )
#puts "MATCH: path_stack = #{path_stack.inspect}"
#puts "MATCH: nodeset = #{nodeset.inspect}"
r = expr( path_stack, nodeset )
#puts "MAIN EXPR => #{r.inspect}"
r
#while ( path_stack.size > 0 and nodeset.size > 0 )
# #puts "MATCH: #{path_stack.inspect} '#{nodeset.collect{|n|n.class}.inspect}'"
# nodeset = expr( path_stack, nodeset )
# #puts "NODESET: #{nodeset.inspect}"
# #puts "PATH_STACK: #{path_stack.inspect}"
#end
#nodeset
end end
private private
def internal_parse path_stack, nodeset
#puts "INTERNAL_PARSE RETURNING WITH NO RESULTS" if nodeset.size == 0 or path_stack.size == 0
return nodeset if nodeset.size == 0 or path_stack.size == 0
#puts "INTERNAL_PARSE: #{path_stack.inspect}, #{nodeset.collect{|n| n.class}.inspect}"
case path_stack.shift
when :document
return [ nodeset[0].root.parent ]
when :qname # Expr takes a stack of path elements and a set of nodes (either a Parent
prefix = path_stack.shift # or an Array) and returns an Array of matching nodes
name = path_stack.shift ALL = [ :attribute, :element, :text, :processing_instruction, :comment ]
#puts "QNAME #{prefix}#{prefix.size>0?':':''}#{name}" ELEMENTS = [ :element ]
n = nodeset.clone def expr( path_stack, nodeset, context=nil )
ns = @namespaces[prefix] #puts "#"*15
ns = ns ? ns : '' #puts "In expr with #{path_stack.inspect}"
n.delete_if do |node| #puts "Returning" if path_stack.length == 0 || nodeset.length == 0
# FIXME: This DOUBLES the time XPath searches take node_types = ELEMENTS
ns = node.namespace( prefix ) if node.node_type == :element and ns == '' return nodeset if path_stack.length == 0 || nodeset.length == 0
#puts "NODE: '#{node.to_s}'; node.has_name?( #{name.inspect}, #{ns.inspect} ): #{ node.has_name?( name, ns )}; node.namespace() = #{node.namespace().inspect}; node.prefix = #{node.prefix().inspect}" if node.node_type == :element while path_stack.length > 0
!(node.node_type == :element and node.name == name and node.namespace == ns ) #puts "Path stack = #{path_stack.inspect}"
end #puts "Nodeset is #{nodeset.inspect}"
return n case (op = path_stack.shift)
when :document
nodeset = [ nodeset[0].root_node ]
#puts ":document, nodeset = #{nodeset.inspect}"
when :any
n = nodeset.clone
n.delete_if { |node| node.node_type != :element }
return n
when :self
# THIS SPACE LEFT INTENTIONALLY BLANK
when :processing_instruction
target = path_stack.shift
n = nodeset.clone
n.delete_if do |node|
(node.node_type != :processing_instruction) or
( !target.nil? and ( node.target != target ) )
end
return n
when :text
#puts ":TEXT"
n = nodeset.clone
n.delete_if do |node|
#puts "#{node} :: #{node.node_type}"
node.node_type != :text
end
return n
when :comment
n = nodeset.clone
n.delete_if do |node|
node.node_type != :comment
end
return n
when :node
return nodeset
# FIXME: I suspect the following XPath will fail:
# /a/*/*[1]
when :child
#puts "CHILD"
new_nodeset = []
nt = nil
for node in nodeset
nt = node.node_type
new_nodeset += node.children if nt == :element or nt == :document
end
#path_stack[0,(path_stack.size-ps_clone.size)] = []
return new_nodeset
when :literal
literal = path_stack.shift
if literal =~ /^\d+(\.\d+)?$/
return ($1 ? literal.to_f : literal.to_i)
end
#puts "RETURNING '#{literal}'"
return literal
when :attribute
new_nodeset = []
case path_stack.shift
when :qname when :qname
#puts "IN QNAME"
prefix = path_stack.shift prefix = path_stack.shift
name = path_stack.shift name = path_stack.shift
for element in nodeset ns = @namespaces[prefix]
if element.node_type == :element ns = ns ? ns : ''
#puts element.name nodeset.delete_if do |node|
#puts "looking for attribute #{name} in '#{@namespaces[prefix]}'" # FIXME: This DOUBLES the time XPath searches take
attr = element.attribute( name, @namespaces[prefix] ) ns = node.namespace( prefix ) if node.node_type == :element and ns == ''
#puts ":ATTRIBUTE: attr => #{attr}" #puts "NS = #{ns.inspect}"
new_nodeset << attr if attr #puts "node.node_type == :element => #{node.node_type == :element}"
if node.node_type == :element
#puts "node.name == #{name} => #{node.name == name}"
if node.name == name
#puts "node.namespace == #{ns.inspect} => #{node.namespace == ns}"
end
end end
!(node.node_type == :element and
node.name == name and
node.namespace == ns )
end end
node_types = ELEMENTS
when :any when :any
#puts "ANY" #puts "ANY 1: nodeset = #{nodeset.inspect}"
for element in nodeset #puts "ANY 1: node_types = #{node_types.inspect}"
if element.node_type == :element nodeset.delete_if { |node| !node_types.include?(node.node_type) }
new_nodeset += element.attributes.to_a #puts "ANY 2: nodeset = #{nodeset.inspect}"
when :self
# This space left intentionally blank
when :processing_instruction
target = path_stack.shift
nodeset.delete_if do |node|
(node.node_type != :processing_instruction) or
( target!='' and ( node.target != target ) )
end
when :text
nodeset.delete_if { |node| node.node_type != :text }
when :comment
nodeset.delete_if { |node| node.node_type != :comment }
when :node
# This space left intentionally blank
node_types = ALL
when :child
new_nodeset = []
nt = nil
for node in nodeset
nt = node.node_type
new_nodeset += node.children if nt == :element or nt == :document
end
nodeset = new_nodeset
node_types = ELEMENTS
when :literal
literal = path_stack.shift
if literal =~ /^\d+(\.\d+)?$/
return ($1 ? literal.to_f : literal.to_i)
end
return literal
when :attribute
new_nodeset = []
case path_stack.shift
when :qname
prefix = path_stack.shift
name = path_stack.shift
for element in nodeset
if element.node_type == :element
#puts element.name
attr = element.attribute( name, @namespaces[prefix] )
new_nodeset << attr if attr
end
end
when :any
#puts "ANY"
for element in nodeset
if element.node_type == :element
new_nodeset += element.attributes.to_a
end
end end
end end
end nodeset = new_nodeset
#puts "RETURNING #{new_nodeset.collect{|n|n.to_s}.inspect}"
return new_nodeset
when :parent when :parent
return internal_parse( path_stack, nodeset.collect{|n| n.parent}.compact ) #puts "PARENT 1: nodeset = #{nodeset}"
nodeset = nodeset.collect{|n| n.parent}.compact
#nodeset = expr(path_stack.dclone, nodeset.collect{|n| n.parent}.compact)
#puts "PARENT 2: nodeset = #{nodeset.inspect}"
node_types = ELEMENTS
when :ancestor when :ancestor
#puts "ANCESTOR" new_nodeset = []
new_nodeset = [] for node in nodeset
for node in nodeset while node.parent
while node.parent
node = node.parent
new_nodeset << node unless new_nodeset.include? node
end
end
#nodeset = new_nodeset.uniq
return new_nodeset
when :ancestor_or_self
new_nodeset = []
for node in nodeset
if node.node_type == :element
new_nodeset << node
while ( node.parent )
node = node.parent node = node.parent
new_nodeset << node unless new_nodeset.include? node new_nodeset << node unless new_nodeset.include? node
end end
end end
end nodeset = new_nodeset
#nodeset = new_nodeset.uniq node_types = ELEMENTS
return new_nodeset
when :predicate when :ancestor_or_self
#puts "@"*80 new_nodeset = []
#puts "NODESET = #{nodeset.collect{|n|n.to_s}.inspect}" for node in nodeset
predicate = path_stack.shift if node.node_type == :element
new_nodeset = [] new_nodeset << node
Functions::size = nodeset.size while ( node.parent )
nodeset.size.times do |index| node = node.parent
node = nodeset[index] new_nodeset << node unless new_nodeset.include? node
Functions::node = node end
Functions::index = index+1 end
#puts "Node #{node} and index=#{index+1}"
result = Predicate( predicate, node )
#puts "Predicate returned #{result} (#{result.class}) for #{node.class}"
if result.kind_of? Numeric
#puts "#{result} == #{index} => #{result == index}"
new_nodeset << node if result == (index+1)
elsif result.instance_of? Array
new_nodeset << node if result.size > 0
else
new_nodeset << node if result
end end
nodeset = new_nodeset
node_types = ELEMENTS
when :predicate
new_nodeset = []
subcontext = { :size => nodeset.size }
pred = path_stack.shift
nodeset.each_with_index { |node, index|
subcontext[ :node ] = node
#puts "PREDICATE SETTING CONTEXT INDEX TO #{index+1}"
subcontext[ :index ] = index+1
pc = pred.dclone
#puts "#{node.hash}) Recursing with #{pred.inspect} and [#{node.inspect}]"
result = expr( pc, [node], subcontext )
result = result[0] if result.kind_of? Array and result.length == 1
#puts "#{node.hash}) Result = #{result.inspect} (#{result.class.name})"
if result.kind_of? Numeric
#puts "Adding node #{node.inspect}" if result == (index+1)
new_nodeset << node if result == (index+1)
elsif result.instance_of? Array
#puts "Adding node #{node.inspect}" if result.size > 0
new_nodeset << node if result.size > 0
else
#puts "Adding node #{node.inspect}" if result
new_nodeset << node if result
end
}
#puts "New nodeset = #{new_nodeset.inspect}"
#puts "Path_stack = #{path_stack.inspect}"
nodeset = new_nodeset
=begin
predicate = path_stack.shift
ns = nodeset.clone
result = expr( predicate, ns )
#puts "Result = #{result.inspect} (#{result.class.name})"
#puts "nodeset = #{nodeset.inspect}"
if result.kind_of? Array
nodeset = result.zip(ns).collect{|m,n| n if m}.compact
else
nodeset = result ? nodeset : []
end
#puts "Outgoing NS = #{nodeset.inspect}"
=end
when :descendant_or_self
rv = descendant_or_self( path_stack, nodeset )
path_stack.clear
nodeset = rv
node_types = ELEMENTS
when :descendant
results = []
nt = nil
for node in nodeset
nt = node.node_type
results += expr( path_stack.dclone.unshift( :descendant_or_self ),
node.children ) if nt == :element or nt == :document
end
nodeset = results
node_types = ELEMENTS
when :following_sibling
#puts "FOLLOWING_SIBLING 1: nodeset = #{nodeset}"
results = []
for node in nodeset
all_siblings = node.parent.children
current_index = all_siblings.index( node )
following_siblings = all_siblings[ current_index+1 .. -1 ]
results += expr( path_stack.dclone, following_siblings )
end
#puts "FOLLOWING_SIBLING 2: nodeset = #{nodeset}"
nodeset = results
when :preceding_sibling
results = []
for node in nodeset
all_siblings = node.parent.children
current_index = all_siblings.index( node )
# Accumulate over every context node; [ 0, n ] is empty when n == 0
preceding_siblings = all_siblings[ 0, current_index ].reverse
results += preceding_siblings
end
nodeset = results
node_types = ELEMENTS
when :preceding
new_nodeset = []
for node in nodeset
new_nodeset += preceding( node )
end
#puts "NEW NODESET => #{new_nodeset.inspect}"
nodeset = new_nodeset
node_types = ELEMENTS
when :following
new_nodeset = []
for node in nodeset
new_nodeset += following( node )
end
nodeset = new_nodeset
node_types = ELEMENTS
when :namespace
new_nodeset = []
for node in nodeset
new_nodeset << node.namespace if node.node_type == :element or node.node_type == :attribute
end
nodeset = new_nodeset
when :variable
var_name = path_stack.shift
return @variables[ var_name ]
# :and, :or, :eq, :neq, :lt, :lteq, :gt, :gteq
when :eq, :neq, :lt, :lteq, :gt, :gteq, :and, :or
left = expr( path_stack.shift, nodeset, context )
#puts "LEFT => #{left.inspect} (#{left.class.name})"
right = expr( path_stack.shift, nodeset, context )
#puts "RIGHT => #{right.inspect} (#{right.class.name})"
res = equality_relational_compare( left, op, right )
#puts "RES => #{res.inspect}"
return res
when :div
left = Functions::number(expr(path_stack.shift, nodeset, context)).to_f
right = Functions::number(expr(path_stack.shift, nodeset, context)).to_f
return (left / right)
when :mod
left = Functions::number(expr(path_stack.shift, nodeset, context )).to_f
right = Functions::number(expr(path_stack.shift, nodeset, context )).to_f
return (left % right)
when :mult
left = Functions::number(expr(path_stack.shift, nodeset, context )).to_f
right = Functions::number(expr(path_stack.shift, nodeset, context )).to_f
return (left * right)
when :plus
left = Functions::number(expr(path_stack.shift, nodeset, context )).to_f
right = Functions::number(expr(path_stack.shift, nodeset, context )).to_f
return (left + right)
when :minus
left = Functions::number(expr(path_stack.shift, nodeset, context )).to_f
right = Functions::number(expr(path_stack.shift, nodeset, context )).to_f
return (left - right)
when :union
left = expr( path_stack.shift, nodeset, context )
right = expr( path_stack.shift, nodeset, context )
return (left | right)
when :neg
res = expr( path_stack, nodeset, context )
return -(res.to_f)
when :not
when :function
func_name = path_stack.shift.tr('-','_')
arguments = path_stack.shift
#puts "FUNCTION 0: #{func_name}(#{arguments.collect{|a|a.inspect}.join(', ')})"
subcontext = context ? nil : { :size => nodeset.size }
res = []
cont = context
nodeset.each_with_index { |n, i|
if subcontext
subcontext[:node] = n
subcontext[:index] = i
cont = subcontext
end
arg_clone = arguments.dclone
args = arg_clone.collect { |arg|
#puts "FUNCTION 1: Calling expr( #{arg.inspect}, [#{n.inspect}] )"
expr( arg, [n], cont )
}
#puts "FUNCTION 2: #{func_name}(#{args.collect{|a|a.inspect}.join(', ')})"
Functions.context = cont
res << Functions.send( func_name, *args )
#puts "FUNCTION 3: #{res[-1].inspect}"
}
return res
end end
#puts "Nodeset after predicate #{predicate.inspect} has #{new_nodeset.size} nodes" end # while
#puts "NODESET: #{new_nodeset.collect{|n|n.to_s}.inspect}" #puts "EXPR returning #{nodeset.inspect}"
return new_nodeset return nodeset
when :descendant_or_self
rv = descendant_or_self( path_stack, nodeset )
path_stack.clear
return rv
when :descendant
#puts ":DESCENDANT"
results = []
nt = nil
for node in nodeset
nt = node.node_type
results += internal_parse( path_stack.clone.unshift( :descendant_or_self ),
node.children ) if nt == :element or nt == :document
end
return results
when :following_sibling
results = []
for node in nodeset
all_siblings = node.parent.children
current_index = all_siblings.index( node )
following_siblings = all_siblings[ current_index+1 .. -1 ]
results += internal_parse( path_stack.clone, following_siblings )
end
return results
when :preceding_sibling
results = []
for node in nodeset
all_siblings = node.parent.children
current_index = all_siblings.index( node )
preceding_siblings = all_siblings[ 0 .. current_index-1 ]
results += internal_parse( path_stack.clone, preceding_siblings )
end
return results
when :preceding
new_nodeset = []
for node in nodeset
new_nodeset += preceding( node )
end
return new_nodeset
when :following
new_nodeset = []
for node in nodeset
new_nodeset += following( node )
end
return new_nodeset
when :namespace
new_set = []
for node in nodeset
new_nodeset << node.namespace if node.node_type == :element or node.node_type == :attribute
end
return new_nodeset
when :variable
var_name = path_stack.shift
return @variables[ var_name ]
end
nodeset
end end
########################################################## ##########################################################
# FIXME # FIXME
# The next two methods are BAD MOJO! # The next two methods are BAD MOJO!
@@ -294,13 +475,16 @@ module REXML
d_o_s( path_stack, nodeset, rs ) d_o_s( path_stack, nodeset, rs )
#puts "RS = #{rs.collect{|n|n.to_s}.inspect}" #puts "RS = #{rs.collect{|n|n.to_s}.inspect}"
document_order(rs.flatten.compact) document_order(rs.flatten.compact)
#rs.flatten.compact
end end
def d_o_s( p, ns, r ) def d_o_s( p, ns, r )
#puts "IN DOS with #{ns.inspect}; ALREADY HAVE #{r.inspect}"
nt = nil nt = nil
ns.each_index do |i| ns.each_index do |i|
n = ns[i] n = ns[i]
x = match( p.clone, [ n ] ) #puts "P => #{p.inspect}"
x = expr( p.dclone, [ n ] )
nt = n.node_type nt = n.node_type
d_o_s( p, n.children, x ) if nt == :element or nt == :document and n.children.size > 0 d_o_s( p, n.children, x ) if nt == :element or nt == :document and n.children.size > 0
r.concat(x) if x.size > 0 r.concat(x) if x.size > 0
@@ -310,6 +494,12 @@ module REXML
# Reorders an array of nodes so that they are in document order # Reorders an array of nodes so that they are in document order
# It tries to do this efficiently. # It tries to do this efficiently.
#
# FIXME: I need to get rid of this, but the issue is that most of the XPath
# interpreter functions as a filter, which means that we lose context going
# in and out of function calls. If I knew what the index of the nodes was,
# I wouldn't have to do this. Maybe add a document IDX for each node?
# Problems with mutable documents. Or, rewrite everything.
def document_order( array_of_nodes ) def document_order( array_of_nodes )
new_arry = [] new_arry = []
array_of_nodes.each { |node| array_of_nodes.each { |node|
@@ -319,8 +509,9 @@ module REXML
node_idx << np.parent.index( np ) node_idx << np.parent.index( np )
np = np.parent np = np.parent
end end
new_arry << [ node_idx.reverse.join, node ] new_arry << [ node_idx.reverse, node ]
} }
#puts "new_arry = #{new_arry.inspect}"
new_arry.sort{ |s1, s2| s1[0] <=> s2[0] }.collect{ |s| s[1] } new_arry.sort{ |s1, s2| s1[0] <=> s2[0] }.collect{ |s| s[1] }
end end
@@ -333,124 +524,127 @@ module REXML
end end
# Given a predicate, a node, and a context, evaluates to true or false.
def Predicate( predicate, node )
predicate = predicate.clone
#puts "#"*20
#puts "Predicate( #{predicate.inspect}, #{node.class} )"
results = []
case (predicate[0])
when :and, :or, :eq, :neq, :lt, :lteq, :gt, :gteq
eq = predicate.shift
left = Predicate( predicate.shift, node )
right = Predicate( predicate.shift, node )
#puts "LEFT = #{left.inspect}"
#puts "RIGHT = #{right.inspect}"
return equality_relational_compare( left, eq, right )
when :div, :mod, :mult, :plus, :minus
op = predicate.shift
left = Predicate( predicate.shift, node )
right = Predicate( predicate.shift, node )
#puts "LEFT = #{left.inspect}"
#puts "RIGHT = #{right.inspect}"
left = Functions::number( left )
right = Functions::number( right )
#puts "LEFT = #{left.inspect}"
#puts "RIGHT = #{right.inspect}"
case op
when :div
return left.to_f / right.to_f
when :mod
return left % right
when :mult
return left * right
when :plus
return left + right
when :minus
return left - right
end
when :union
predicate.shift
left = Predicate( predicate.shift, node )
right = Predicate( predicate.shift, node )
return (left | right)
when :neg
predicate.shift
operand = Functions::number(Predicate( predicate, node ))
return -operand
when :not
predicate.shift
return !Predicate( predicate.shift, node )
when :function
predicate.shift
func_name = predicate.shift.tr('-', '_')
arguments = predicate.shift
#puts "\nFUNCTION: #{func_name}"
#puts "ARGUMENTS: #{arguments.inspect} #{node.to_s}"
args = arguments.collect { |arg| Predicate( arg, node ) }
#puts "FUNCTION: #{func_name}( #{args.collect{|n|n.to_s}.inspect} )"
result = Functions.send( func_name, *args )
#puts "RESULTS: #{result.inspect}"
return result
else
return match( predicate, [ node ] )
end
end
# Builds a nodeset of all of the following nodes of the supplied node,
# in document order
def following( node )
all_siblings = node.parent.children
current_index = all_siblings.index( node )
following_siblings = all_siblings[ current_index+1 .. -1 ]
following = []
recurse( following_siblings ) { |node| following << node }
following.shift
#puts "following is returning #{puta following}"
following
end
# Builds a nodeset of all of the preceding nodes of the supplied node, # Builds a nodeset of all of the preceding nodes of the supplied node,
# in reverse document order # in reverse document order
# preceding:: includes every element in the document that precedes this node,
# except for ancestors
def preceding( node ) def preceding( node )
all_siblings = node.parent.children #puts "IN PRECEDING"
current_index = all_siblings.index( node ) ancestors = []
preceding_siblings = all_siblings[ 0 .. current_index-1 ] p = node.parent
while p
ancestors << p
p = p.parent
end
preceding = [] acc = []
recurse( preceding_siblings ) { |node| preceding.unshift( node ) } p = preceding_node_of( node )
preceding #puts "P = #{p.inspect}"
while p
if ancestors.include? p
ancestors.delete(p)
else
acc << p
end
p = preceding_node_of( p )
#puts "P = #{p.inspect}"
end
acc
end
def preceding_node_of( node )
#puts "NODE: #{node.inspect}"
#puts "PREVIOUS NODE: #{node.previous_sibling_node.inspect}"
#puts "PARENT NODE: #{node.parent}"
psn = node.previous_sibling_node
if psn.nil?
if node.parent.nil? or node.parent.class == Document
return nil
end
return node.parent
#psn = preceding_node_of( node.parent )
end
while psn and psn.kind_of? Element and psn.children.size > 0
psn = psn.children[-1]
end
psn
end
def following( node )
#puts "IN FOLLOWING"
acc = []
p = next_sibling_node( node )
#puts "P = #{p.inspect}"
while p
acc << p
p = following_node_of( p )
#puts "P = #{p.inspect}"
end
acc
end
def following_node_of( node )
#puts "NODE: #{node.inspect}"
#puts "PREVIOUS NODE: #{node.previous_sibling_node.inspect}"
#puts "PARENT NODE: #{node.parent}"
if node.kind_of? Element and node.children.size > 0
return node.children[0]
end
return next_sibling_node(node)
end
def next_sibling_node(node)
psn = node.next_sibling_node
while psn.nil?
if node.parent.nil? or node.parent.class == Document
return nil
end
node = node.parent
psn = node.next_sibling_node
#puts "psn = #{psn.inspect}"
end
return psn
end
def norm b
case b
when true, false
return b
when 'true', 'false'
return Functions::boolean( b )
when /^\d+(\.\d+)?$/
return Functions::number( b )
else
return Functions::string( b )
end
end end
def equality_relational_compare( set1, op, set2 ) def equality_relational_compare( set1, op, set2 )
#puts "#"*80 #puts "EQ_REL_COMP(#{set1.inspect} #{op.inspect} #{set2.inspect})"
if set1.kind_of? Array and set2.kind_of? Array if set1.kind_of? Array and set2.kind_of? Array
#puts "#{set1.size} & #{set2.size}" #puts "#{set1.size} & #{set2.size}"
if set1.size == 1 and set2.size == 1 if set1.size == 1 and set2.size == 1
set1 = set1[0] set1 = set1[0]
set2 = set2[0] set2 = set2[0]
elsif set1.size == 0 or set2.size == 0 elsif set1.size == 0 or set2.size == 0
nd = set1.size==0 ? set2 : set1 nd = set1.size==0 ? set2 : set1
nd.each { |il| return true if compare( il, op, nil ) } rv = nd.collect { |il| compare( il, op, nil ) }
#puts "RV = #{rv.inspect}"
return rv
else else
set1.each do |i1| res = []
i1 = i1.to_s enum = SyncEnumerator.new( set1, set2 ).each { |i1, i2|
set2.each do |i2| #puts "i1 = #{i1.inspect} (#{i1.class.name})"
i2 = i2.to_s #puts "i2 = #{i2.inspect} (#{i2.class.name})"
return true if compare( i1, op, i2 ) i1 = norm( i1 )
end i2 = norm( i2 )
end res << compare( i1, op, i2 )
return false }
return res
end end
end end
#puts "EQ_REL_COMP: #{set1.class.name} #{set1.inspect}, #{op}, #{set2.class.name} #{set2.inspect}" #puts "EQ_REL_COMP: #{set1.inspect} (#{set1.class.name}), #{op}, #{set2.inspect} (#{set2.class.name})"
#puts "COMPARING VALUES" #puts "COMPARING VALUES"
# If one is nodeset and other is number, compare number to each item # If one is nodeset and other is number, compare number to each item
# in nodeset s.t. number op number(string(item)) # in nodeset s.t. number op number(string(item))
@@ -459,40 +653,28 @@ module REXML
# If one is nodeset and other is boolean, compare boolean to each item # If one is nodeset and other is boolean, compare boolean to each item
# in nodeset s.t. boolean op boolean(item) # in nodeset s.t. boolean op boolean(item)
if set1.kind_of? Array or set2.kind_of? Array if set1.kind_of? Array or set2.kind_of? Array
#puts "ISA ARRAY" #puts "ISA ARRAY"
if set1.kind_of? Array if set1.kind_of? Array
a = set1 a = set1
b = set2.to_s b = set2
else else
a = set2 a = set2
b = set1.to_s b = set1
end end
case b case b
when 'true', 'false' when true, false
b = Functions::boolean( b ) return a.collect {|v| compare( Functions::boolean(v), op, b ) }
for v in a when Numeric
v = Functions::boolean(v) return a.collect {|v| compare( Functions::number(v), op, b )}
return true if compare( v, op, b )
end
when /^\d+(\.\d+)?$/ when /^\d+(\.\d+)?$/
b = Functions::number( b ) b = Functions::number( b )
#puts "B = #{b.inspect}" #puts "B = #{b.inspect}"
for v in a return a.collect {|v| compare( Functions::number(v), op, b )}
#puts "v = #{v.inspect}"
v = Functions::number(v)
#puts "v = #{v.inspect}"
#puts compare(v,op,b)
return true if compare( v, op, b )
end
else else
#puts "Functions::string( #{b}(#{b.class.name}) ) = #{Functions::string(b)}" #puts "Functions::string( #{b}(#{b.class.name}) ) = #{Functions::string(b)}"
b = Functions::string( b ) b = Functions::string( b )
for v in a return a.collect { |v| compare( Functions::string(v), op, b ) }
#puts "v = #{v.class.name} #{v.inspect}"
v = Functions::string(v)
return true if compare( v, op, b )
end
end end
else else
# If neither is nodeset, # If neither is nodeset,
@@ -532,7 +714,7 @@ module REXML
end end
def compare a, op, b def compare a, op, b
#puts "COMPARE #{a.to_s}(#{a.class.name}) #{op} #{b.to_s}(#{a.class.name})" #puts "COMPARE #{a.inspect}(#{a.class.name}) #{op} #{b.inspect}(#{b.class.name})"
case op case op
when :eq when :eq
a == b a == b