I think it's debatable which is the most common usage of
`FileUtils.mkdir_p`, but even assuming the most common use case is
creating a folder when it doesn't previously exist but the parent does,
this optimization doesn't seem to have a noticeable effect there while
harming other use cases.
For benchmarks, I created this script
```ruby
require "fileutils"
require "benchmark/ips"

# FileUtils.mkdir_p_new is not part of FileUtils: it is the local copy of
# mkdir_p with the "optimization" removed, and must be defined before running.

# Case 1: the target directory already exists.
Benchmark.ips do |x|
  x.report("old mkdir_p - exists") do
    FileUtils.mkdir_p "/tmp"
  end
  x.report("new_mkdir_p - exists") do
    FileUtils.mkdir_p_new "/tmp"
  end
  x.compare!
end

FileUtils.rm_rf "/tmp/foo"

# Case 2: the target does not exist, but its parent does.
Benchmark.ips do |x|
  x.report("old mkdir_p - doesnt exist, parent exists") do
    FileUtils.mkdir_p "/tmp/foo"
    FileUtils.rm_rf "/tmp/foo"
  end
  x.report("new_mkdir_p - doesnt exist, parent exists") do
    FileUtils.mkdir_p_new "/tmp/foo"
    FileUtils.rm_rf "/tmp/foo"
  end
  x.compare!
end

# Case 3: neither the target nor its parent exists.
Benchmark.ips do |x|
  x.report("old mkdir_p - doesnt exist, parent either") do
    FileUtils.mkdir_p "/tmp/foo/bar"
    FileUtils.rm_rf "/tmp/foo"
  end
  x.report("new_mkdir_p - doesnt exist, parent either") do
    FileUtils.mkdir_p_new "/tmp/foo/bar"
    FileUtils.rm_rf "/tmp/foo"
  end
  x.compare!
end

# Case 4: several levels are missing.
Benchmark.ips do |x|
  x.report("old mkdir_p - more levels") do
    FileUtils.mkdir_p "/tmp/foo/bar/baz"
    FileUtils.rm_rf "/tmp/foo"
  end
  x.report("new_mkdir_p - more levels") do
    FileUtils.mkdir_p_new "/tmp/foo/bar/baz"
    FileUtils.rm_rf "/tmp/foo"
  end
  x.compare!
end
```
and copied the method with the "optimization" removed as
`FileUtils.mkdir_p_new`. The results are as below:
```
Warming up --------------------------------------
old mkdir_p - exists 15.914k i/100ms
new_mkdir_p - exists 46.512k i/100ms
Calculating -------------------------------------
old mkdir_p - exists 161.461k (± 3.2%) i/s - 811.614k in 5.032315s
new_mkdir_p - exists 468.192k (± 2.9%) i/s - 2.372M in 5.071225s
Comparison:
new_mkdir_p - exists: 468192.1 i/s
old mkdir_p - exists: 161461.0 i/s - 2.90x (± 0.00) slower
Warming up --------------------------------------
old mkdir_p - doesnt exist, parent exists
2.142k i/100ms
new_mkdir_p - doesnt exist, parent exists
1.961k i/100ms
Calculating -------------------------------------
old mkdir_p - doesnt exist, parent exists
21.242k (± 6.7%) i/s - 107.100k in 5.069206s
new_mkdir_p - doesnt exist, parent exists
19.682k (± 4.2%) i/s - 100.011k in 5.091961s
Comparison:
old mkdir_p - doesnt exist, parent exists: 21241.7 i/s
new_mkdir_p - doesnt exist, parent exists: 19681.7 i/s - same-ish: difference falls within error
Warming up --------------------------------------
old mkdir_p - doesnt exist, parent either
945.000 i/100ms
new_mkdir_p - doesnt exist, parent either
1.002k i/100ms
Calculating -------------------------------------
old mkdir_p - doesnt exist, parent either
9.689k (± 4.4%) i/s - 49.140k in 5.084342s
new_mkdir_p - doesnt exist, parent either
10.806k (± 4.6%) i/s - 54.108k in 5.020714s
Comparison:
new_mkdir_p - doesnt exist, parent either: 10806.3 i/s
old mkdir_p - doesnt exist, parent either: 9689.3 i/s - 1.12x (± 0.00) slower
Warming up --------------------------------------
old mkdir_p - more levels
702.000 i/100ms
new_mkdir_p - more levels
775.000 i/100ms
Calculating -------------------------------------
old mkdir_p - more levels
7.046k (± 3.5%) i/s - 35.802k in 5.087548s
new_mkdir_p - more levels
7.685k (± 5.5%) i/s - 38.750k in 5.061351s
Comparison:
new_mkdir_p - more levels: 7685.1 i/s
old mkdir_p - more levels: 7046.4 i/s - same-ish: difference falls within error
```
I think it's better to keep the code simpler if the benefit of the
optimization is not clear, as in this case.
https://github.com/ruby/fileutils/commit/e842a0e70e
Previously, if creating the directory directly didn't work
and the directory didn't exist, mkdir_p would create all
directories from the root. This modifies the approach to
check whether the directory exists when walking up the
directory tree from the argument, and once you have found an
intermediate directory that exists, you only need to create
directories under it.
This approach has a couple of advantages:
1) It performs better when most directories in path already exist,
and that will be true for most usage of mkdir_p, as mkdir_p is
usually called with paths where the first few directories exist
and only the last directory or last few directories do not.
2) It works in file-system access limited environments such as
when unveil(2) is used on OpenBSD. In these environments, if
/foo/bar/baz exists and is unveiled, you can do
`mkdir /foo/bar/baz/xyz` but `mkdir /foo` and `mkdir /foo/bar` raise
Errno::ENOENT.
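The walk-up approach described above can be sketched roughly as follows.
This is only an illustration under stated assumptions, not the code in
fileutils.rb: `mkdir_p_sketch` is a hypothetical name, and error handling
is reduced to the minimum.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper for illustration; not the FileUtils implementation.
def mkdir_p_sketch(path)
  # Fast path: the parent usually exists, so try the target directly.
  begin
    Dir.mkdir(path)
    return path
  rescue SystemCallError
    return path if File.directory?(path)
  end

  # Walk up from the argument, collecting the directories that are missing,
  # and stop at the first ancestor that already exists (or at the root).
  missing = [path]
  parent = File.dirname(path)
  until File.directory?(parent) || parent == File.dirname(parent)
    missing.unshift(parent)
    parent = File.dirname(parent)
  end

  # Create only the directories below the existing ancestor, top down.
  missing.each do |dir|
    begin
      Dir.mkdir(dir)
    rescue SystemCallError
      # Tolerate a directory that appeared in the meantime; re-raise the rest.
      raise unless File.directory?(dir)
    end
  end
  path
end

Dir.mktmpdir do |tmp|
  target = File.join(tmp, "foo", "bar", "baz")
  mkdir_p_sketch(target)
  puts File.directory?(target)
end
```

Because the sketch never calls `Dir.mkdir` on an ancestor that already
exists, it would not touch `/foo` or `/foo/bar` in the unveil(2) scenario.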
https://github.com/ruby/fileutils/commit/ec0c229e78
The `FileUtils#install` method raises an unexpected `TypeError` when
called with a `mode:` option that contains `"X"`.
```
$ ruby -rfileutils -e 'FileUtils.install("tmp/a", "tmp/b", mode: "o+X")'
/opt/local/lib/ruby/2.7.0/fileutils.rb:942:in `directory?': no implicit conversion of File::Stat into String (TypeError)
from /opt/local/lib/ruby/2.7.0/fileutils.rb:942:in `block (3 levels) in symbolic_modes_to_i'
from /opt/local/lib/ruby/2.7.0/fileutils.rb:933:in `each_char'
from /opt/local/lib/ruby/2.7.0/fileutils.rb:933:in `each'
from /opt/local/lib/ruby/2.7.0/fileutils.rb:933:in `inject'
from /opt/local/lib/ruby/2.7.0/fileutils.rb:933:in `block (2 levels) in symbolic_modes_to_i'
from /opt/local/lib/ruby/2.7.0/fileutils.rb:931:in `each'
from /opt/local/lib/ruby/2.7.0/fileutils.rb:931:in `each_slice'
from /opt/local/lib/ruby/2.7.0/fileutils.rb:931:in `block in symbolic_modes_to_i'
from /opt/local/lib/ruby/2.7.0/fileutils.rb:926:in `each'
from /opt/local/lib/ruby/2.7.0/fileutils.rb:926:in `inject'
from /opt/local/lib/ruby/2.7.0/fileutils.rb:926:in `symbolic_modes_to_i'
from /opt/local/lib/ruby/2.7.0/fileutils.rb:973:in `fu_mode'
from /opt/local/lib/ruby/2.7.0/fileutils.rb:883:in `block in install'
from /opt/local/lib/ruby/2.7.0/fileutils.rb:1588:in `block in fu_each_src_dest'
from /opt/local/lib/ruby/2.7.0/fileutils.rb:1604:in `fu_each_src_dest0'
from /opt/local/lib/ruby/2.7.0/fileutils.rb:1586:in `fu_each_src_dest'
from /opt/local/lib/ruby/2.7.0/fileutils.rb:877:in `install'
from -e:1:in `<main>'
```
Although `symbolic_modes_to_i` handles the case where `path` is a
`File::Stat` at the beginning, in the `"X"` case `path` is passed to the
`FileTest.directory?` method, which requires a `String`. In that
case, the mode stored in `path` should be examined instead.
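A minimal sketch of the idea, assuming a hypothetical helper named
`directory_entry?` (it is not a FileUtils method): when the argument is
already a `File::Stat`, ask the stat object itself instead of passing it
to a method that expects a path.

```ruby
require "tmpdir"

# Hypothetical helper: accepts either a path String or a File::Stat.
def directory_entry?(path_or_stat)
  if path_or_stat.is_a?(File::Stat)
    # Examine the mode already held by the stat object.
    path_or_stat.directory?
  else
    File.directory?(path_or_stat)
  end
end

puts directory_entry?(Dir.tmpdir)
puts directory_entry?(File.stat(Dir.tmpdir))
```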
https://github.com/ruby/fileutils/commit/af675af6b2
When `LANG=C`, `dir` is `UTF-8` and `base` is `ASCII-8BIT` in `FileUtils::Entry_#join`,
so `Encoding::CompatibilityError` is raised and the files are not removed.
https://rubyci.org/logs/rubyci.s3.amazonaws.com/arch/ruby-master/log/20200611T060002Z.fail.html.gz
```
1) Error:
WEBrick::TestFileHandler#test_cjk_in_path:
Errno::ENOTEMPTY: Directory not empty @ dir_s_rmdir - /home/chkbuild/chkbuild/tmp/build/20200611T060002Z/tmp/???20200611-1887828-3nn72a
/home/chkbuild/chkbuild/tmp/build/20200611T060002Z/ruby/lib/fileutils.rb:1460:in `rmdir'
/home/chkbuild/chkbuild/tmp/build/20200611T060002Z/ruby/lib/fileutils.rb:1460:in `block in remove_dir1'
/home/chkbuild/chkbuild/tmp/build/20200611T060002Z/ruby/lib/fileutils.rb:1471:in `platform_support'
/home/chkbuild/chkbuild/tmp/build/20200611T060002Z/ruby/lib/fileutils.rb:1459:in `remove_dir1'
/home/chkbuild/chkbuild/tmp/build/20200611T060002Z/ruby/lib/fileutils.rb:1452:in `remove'
/home/chkbuild/chkbuild/tmp/build/20200611T060002Z/ruby/lib/fileutils.rb:780:in `block in remove_entry'
/home/chkbuild/chkbuild/tmp/build/20200611T060002Z/ruby/lib/fileutils.rb:1509:in `ensure in postorder_traverse'
/home/chkbuild/chkbuild/tmp/build/20200611T060002Z/ruby/lib/fileutils.rb:1509:in `postorder_traverse'
/home/chkbuild/chkbuild/tmp/build/20200611T060002Z/ruby/lib/fileutils.rb:778:in `remove_entry'
/home/chkbuild/chkbuild/tmp/build/20200611T060002Z/ruby/lib/tmpdir.rb:97:in `mktmpdir'
/home/chkbuild/chkbuild/tmp/build/20200611T060002Z/ruby/test/webrick/test_filehandler.rb:292:in `test_cjk_in_path'
```
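The encoding mismatch behind this failure can be reproduced in isolation.
The sketch below is an illustration only: `join_sketch` and
`join_with_retag` are hypothetical names, plain `String#+` stands in for
the real join, and re-tagging the bytes is just one possible workaround,
not necessarily what the commit does.

```ruby
# Hypothetical stand-in for a path join, using plain concatenation.
def join_sketch(dir, base)
  dir + "/" + base
end

# One possible workaround (an assumption): re-tag the entry name with the
# directory's encoding before joining. The bytes themselves are unchanged.
def join_with_retag(dir, base)
  join_sketch(dir, base.dup.force_encoding(dir.encoding))
end

dir  = "/tmp/\u3042"   # UTF-8 string containing a non-ASCII character
base = "\u3044".b      # same bytes as a UTF-8 name, but tagged ASCII-8BIT

begin
  join_sketch(dir, base)
rescue Encoding::CompatibilityError => e
  puts "join failed: #{e.class}"
end

puts join_with_retag(dir, base).valid_encoding?
```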
Musl libc implements this function as a thin wrapper around fchmodat(3posix).
The Linux kernel, however, does not support changing the mode of a symlink,
so the operation always fails with EOPNOTSUPP. This fchmodat behaviour is
defined in POSIX, so we have to handle such exceptions.
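As a rough sketch of handling such exceptions, assuming the function in
question is reached through `File.lchmod` (the text above does not name
it) and using a hypothetical helper name:

```ruby
require "tmpdir"

# Hypothetical helper: try to change a symlink's own mode and report whether
# the platform allowed it, instead of letting the error propagate.
def lchmod_if_supported(mode, path)
  File.lchmod(mode, path)
  true
rescue NotImplementedError, Errno::EOPNOTSUPP, Errno::ENOTSUP
  # NotImplementedError: Ruby was built without lchmod support.
  # EOPNOTSUPP: the libc wrapper exists but the kernel refuses the operation.
  # ENOTSUP is rescued as well; that is an assumption for platforms where it
  # is a distinct errno.
  false
end

Dir.mktmpdir do |dir|
  target = File.join(dir, "target")
  link = File.join(dir, "link")
  File.write(target, "")
  File.symlink(target, link)
  puts lchmod_if_supported(0o644, link)
end
```

The result depends on the platform, so callers should treat `false` as
"not supported here" rather than as a failure.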
The `FileUtils#install` method raises an unexpected `TypeError` when
called with a `mode:` option that contains `"X"`.
```
$ ruby -rfileutils -e 'FileUtils.install("tmp/a", "tmp/b", mode: "o+X")'
/opt/local/lib/ruby/2.7.0/fileutils.rb:942:in `directory?': no implicit conversion of File::Stat into String (TypeError)
from /opt/local/lib/ruby/2.7.0/fileutils.rb:942:in `block (3 levels) in symbolic_modes_to_i'
from /opt/local/lib/ruby/2.7.0/fileutils.rb:933:in `each_char'
from /opt/local/lib/ruby/2.7.0/fileutils.rb:933:in `each'
from /opt/local/lib/ruby/2.7.0/fileutils.rb:933:in `inject'
from /opt/local/lib/ruby/2.7.0/fileutils.rb:933:in `block (2 levels) in symbolic_modes_to_i'
from /opt/local/lib/ruby/2.7.0/fileutils.rb:931:in `each'
from /opt/local/lib/ruby/2.7.0/fileutils.rb:931:in `each_slice'
from /opt/local/lib/ruby/2.7.0/fileutils.rb:931:in `block in symbolic_modes_to_i'
from /opt/local/lib/ruby/2.7.0/fileutils.rb:926:in `each'
from /opt/local/lib/ruby/2.7.0/fileutils.rb:926:in `inject'
from /opt/local/lib/ruby/2.7.0/fileutils.rb:926:in `symbolic_modes_to_i'
from /opt/local/lib/ruby/2.7.0/fileutils.rb:973:in `fu_mode'
from /opt/local/lib/ruby/2.7.0/fileutils.rb:883:in `block in install'
from /opt/local/lib/ruby/2.7.0/fileutils.rb:1588:in `block in fu_each_src_dest'
from /opt/local/lib/ruby/2.7.0/fileutils.rb:1604:in `fu_each_src_dest0'
from /opt/local/lib/ruby/2.7.0/fileutils.rb:1586:in `fu_each_src_dest'
from /opt/local/lib/ruby/2.7.0/fileutils.rb:877:in `install'
from -e:1:in `<main>'
```
Although `symbolic_modes_to_i` handles the case where `path` is a
`File::Stat` at the beginning, in the `"X"` case `path` is passed to the
`FileTest.directory?` method, which requires a `String`. In that
case, the mode stored in `path` should be examined instead.
https://github.com/ruby/fileutils/commit/2ea54ade2f
Previously this would copy the symlink root as a symlink instead
of creating a new root directory. This modifies the source
to expand it using File.realpath before starting the copy.
Fixes Ruby Bug 12123
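The effect of expanding the source first can be illustrated with
`FileUtils.cp_r`. This is a sketch under assumptions: the message does not
say which copy method is affected, and `copy_tree_resolving_root` is a
hypothetical helper name.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper: resolve a symlinked root to the real directory before
# copying, so the destination root is a new directory rather than a symlink.
def copy_tree_resolving_root(src, dest)
  FileUtils.cp_r(File.realpath(src), dest)
  dest
end

Dir.mktmpdir do |tmp|
  real = File.join(tmp, "real")
  link = File.join(tmp, "link")
  copy = File.join(tmp, "copy")
  Dir.mkdir(real)
  File.write(File.join(real, "a.txt"), "hi")
  File.symlink(real, link)

  copy_tree_resolving_root(link, copy)
  puts File.symlink?(copy)
  puts File.directory?(copy)
end
```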
https://github.com/ruby/fileutils/commit/7359cef359
If FileUtils is included into another object, and verbose mode is
used, a FrozenError is currently raised unless the object has the
@fileutils_output and @fileutils_label instance variables.
This fixes things so that it does not attempt to set the instance
variables, but it still uses them if they are present.
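A minimal sketch of reading the instance variables without ever assigning
them, which is what makes the approach safe for frozen objects. The module
and method names are hypothetical; only the `@fileutils_output` and
`@fileutils_label` names come from the text above.

```ruby
require "stringio"

# Hypothetical module standing in for the verbose-output helper.
module OutputSketch
  def sketch_output_message(msg)
    # Only read the variables; assigning them would raise FrozenError
    # on a frozen receiver.
    output = @fileutils_output if defined?(@fileutils_output)
    output ||= $stdout
    label = defined?(@fileutils_label) ? @fileutils_label.to_s : ""
    output.puts(label + msg)
  end
end

frozen = Object.new.extend(OutputSketch).freeze
frozen.sketch_output_message("mkdir -p /tmp/foo")
```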
https://github.com/ruby/fileutils/commit/689cb9c56a
Previously, this was broken. Trying to copy a FIFO would raise a
NoMethodError if File.mkfifo was defined. Trying to copy a UNIX
socket would raise a RuntimeError as File.mknod is not something
Ruby defines.
Handle the FIFO issue using File.mkfifo instead of mkfifo.
Handle the UNIX Socket issue by creating a unix socket.
Continue to not support character or block devices, raising a
RuntimeError for both.
Add tests for FIFO, UNIX Socket, and character/block devices.
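The handling described above can be sketched as follows. This is an
illustration assuming a Unix-like platform, with a hypothetical helper
name and simplified mode handling; it is not the code in fileutils.rb.

```ruby
require "socket"
require "tmpdir"

# Hypothetical helper: recreate a special file at dest based on src's type.
def copy_special_sketch(src, dest)
  st = File.lstat(src)
  if st.pipe?
    # FIFO: call File.mkfifo explicitly rather than a bare mkfifo.
    File.mkfifo(dest, st.mode & 0o777)
  elsif st.socket?
    # UNIX socket: binding a server creates the socket file; closing the
    # server leaves the file in place.
    UNIXServer.new(dest).close
    File.chmod(st.mode & 0o777, dest)
  elsif st.chardev? || st.blockdev?
    # Character and block devices remain unsupported.
    raise "cannot handle device file: #{src}"
  else
    raise ArgumentError, "not a special file: #{src}"
  end
  dest
end

Dir.mktmpdir do |tmp|
  src = File.join(tmp, "fifo")
  File.mkfifo(src)
  copy_special_sketch(src, File.join(tmp, "fifo_copy"))
  puts File.lstat(File.join(tmp, "fifo_copy")).pipe?
end
```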
https://github.com/ruby/fileutils/commit/123903532d
Cfuncs that use rb_scan_args with the : entry suffer similar keyword
argument separation issues that Ruby methods suffer if the cfuncs
accept optional or variable arguments.
This makes the following changes to : handling.
* Treats as **kw, prompting keyword argument separation warnings
if called with a positional hash.
* Do not look for an option hash if empty keywords are provided.
For backwards compatibility, treat an empty keyword splat as an empty
mandatory positional hash argument, but emit a warning, as this
behavior will be removed in Ruby 3. The argument number check
needs to be moved lower so it can correctly handle an empty
positional argument being added.
* If the last argument is nil and it is necessary to treat it as an option
hash in order to make sure all arguments are processed, continue to
treat the last argument as the option hash. Emit a warning in this case,
as this behavior will be removed in Ruby 3.
* If splitting the keyword hash into two hashes, issue a warning, as we
will not be splitting hashes in Ruby 3.
* If the keyword argument is required to fill a mandatory positional
argument, continue to do so, but emit a warning as this behavior will
be going away in Ruby 3.
* If keyword arguments are provided and the last argument is not a hash,
that indicates something wrong. This can happen if a cfunc is calling
rb_scan_args multiple times, and providing arguments that were not
passed to it from Ruby. Callers need to switch to the new
rb_scan_args_kw function, which allows passing of whether keywords
were provided.
This commit fixes all warnings caused by the changes above.
It switches some function calls to *_kw versions with appropriate
kw_splat flags. If delegating arguments, RB_PASS_CALLED_KEYWORDS
is used. If creating new arguments, RB_PASS_KEYWORDS is used if
the last argument is a hash to be treated as keywords.
In open_key_args in io.c, use rb_scan_args_kw.
In this case, the arguments provided come from another C
function, not Ruby. The last argument may or may not be a hash,
so we can't set keyword argument mode. However, if it is a
hash, we don't want to warn when treating it as keywords.
In Ruby files, make sure to appropriately use keyword splats
or literal keywords when calling Cfuncs that now issue keyword
argument separation warnings through rb_scan_args. Also, make
sure not to pass nil in place of an option hash.
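The calling convention recommended above can be illustrated from the Ruby
side. The sketch uses a Ruby-defined method as a stand-in for a cfunc, so
it only approximates what rb_scan_args does; `scan_sketch` is a
hypothetical name.

```ruby
# Hypothetical stand-in for a cfunc that takes one mandatory argument,
# one optional argument, and keywords.
def scan_sketch(path, mode = nil, **opts)
  [path, mode, opts]
end

options = { encoding: "UTF-8" }

# Literal keywords and keyword splats are always treated as keywords.
p scan_sketch("a.txt", encoding: "UTF-8")
p scan_sketch("a.txt", **options)

# When the options may be nil, splat an empty hash rather than passing nil.
maybe_options = nil
p scan_sketch("a.txt", **(maybe_options || {}))
```

Passing `options` positionally, as in `scan_sketch("a.txt", options)`, is
the pattern to avoid: Ruby 2.7 still treats the hash as keywords but emits
a warning, and Ruby 3 binds it to `mode` instead.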
Work around Kernel#warn warnings due to problems in the Rubygems
override of the method. There is an open pull request to fix
these issues in Rubygems, but part of the Rubygems tests for
their override fail on ruby-head due to rb_scan_args not
recognizing empty keyword splats, which this commit fixes.
Implementation-wise, adding rb_scan_args_kw is kind of a pain,
because rb_scan_args takes a variable number of arguments.
In order to not duplicate all the code, the function internals need
to be split into two functions taking a va_list, and to avoid passing
in a ton of arguments, a single struct argument is used to handle
the variables previously local to the function.