[rustdoc] Rustdoc should prevent long file names.
I'm not actually certain if handling absurdly long method names should be a goal for rustdoc, but currently they produce filenames that are too long for the filesystem. I discovered this while working with opencv-rust, which auto-generates C wrapper methods for OpenCV, some of which have very long names. One proposed solution was for rustdoc to abbreviate filenames; I suggest that a good solution might be to split the name into 255-character chunks.
Eg the problem file: /home/benjamin/dev/opencv-rust/target/doc/opencv/sys/fn.cv_calib3d_cv_solvePnPRansac_InputArray_objectPoints_InputArray_imagePoints_InputArray_cameraMatrix_InputArray_distCoeffs_OutputArray_rvec_OutputArray_tvec_bool_useExtrinsicGuess_int_iterationsCount_float_reprojectionError_int_minInliersCount_OutputArray_inliers_int_flags.html
Would become something like: /home/benjamin/dev/opencv-rust/target/doc/opencv/sys/fn.cv_calib3d_cv_solvePnPRansac_InputArray_objectPoints_InputArray_imagePoints_InputArray_cameraMatrix_InputArray_distCoeffs_OutputArray_rvec_OutputArray_tvec_bool_useExtrinsicGuess_int_iterationsCount_float_reprojectionErr/or_int_minInliersCount_OutputArray_inliers_int_flags.html
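A minimal sketch of the chunking idea, in Rust. The helper name chunk_file_name is hypothetical and not part of rustdoc, and the sketch assumes an ASCII name so that byte-sized chunks remain valid UTF-8:

```rust
use std::path::PathBuf;

/// Hypothetical helper (not a rustdoc API): split an over-long file name
/// into nested path segments of at most `max_len` bytes each.
/// Assumption: `name` is ASCII, so any byte chunk is valid UTF-8.
fn chunk_file_name(name: &str, max_len: usize) -> PathBuf {
    assert!(max_len > 0, "max_len must be positive");
    assert!(name.is_ascii(), "this sketch only handles ASCII names");
    name.as_bytes()
        .chunks(max_len)
        .map(|chunk| std::str::from_utf8(chunk).expect("ASCII is valid UTF-8"))
        .collect()
}
```

Every segment but the last becomes a directory, so a name that fits within max_len is returned unchanged as a single-component path.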
Previous discussion on the rust subreddit: https://www.reddit.com/r/rust/comments/4m29tk/solution_to_file_name_too_long_for_cargo_doc/
Small update: This also appears to be some regression of sorts; after switching back to the 1.8.0 toolchain with rustup I can document the crate, as these methods are skipped.
Edit, these methods are marked #[doc(hidden)]. Perhaps then the correct answer is a way to make rustdoc not generate docs for hidden methods?
Benjamin Elder at 2016-06-01 19:09:15
> Edit, these methods are marked #[doc(hidden)]. Perhaps then the correct answer is a way to make rustdoc not generate docs for hidden methods?
It isn't supposed to.
Guillaume Gomez at 2016-06-01 19:35:21
As far as I can tell, this method is in /target/debug/build/opencv-<hash>/out/calib3d.extern.rs and marked #[doc(hidden)]. It is then include!-ed in /target/debug/build/opencv-<hash>/out/hub.rs in a public sys module.
As far as I can tell now, it should be marked hidden and excluded from the docs, but 1.9.0 and nightly appear to attempt to build docs for these methods anyhow.
Benjamin Elder at 2016-06-01 19:45:58
I've opened a new issue (https://github.com/rust-lang/rust/issues/34025) to reflect what seems to be the actual problem, but I think that avoiding large file names might also be worth discussing so I will leave this open as well for now.
Benjamin Elder at 2016-06-01 20:19:04
Couldn't we just shorten the name? Come up with some pattern, maybe take inspiration from DOS style short filenames, like shorten it to 100 characters and anything beyond that is replaced with ~1 or a higher number if another file has the same name, or maybe tack on a short hash of the name.
Peter Atashian at 2016-06-01 20:43:05
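A rough Rust sketch of the DOS-style counter variant suggested above. The function name shorten_with_counter and the `used` set are hypothetical, introduced only for illustration, and the sketch assumes that '~' never appears in an unshortened name:

```rust
use std::collections::HashSet;

/// Hypothetical helper (not a rustdoc API): keep names that fit, otherwise
/// truncate to `max_len` bytes and append `~N`, picking the first N that
/// has not been handed out yet. `used` records every name returned so far.
fn shorten_with_counter(name: &str, max_len: usize, used: &mut HashSet<String>) -> String {
    if name.len() <= max_len {
        used.insert(name.to_string());
        return name.to_string();
    }
    // Back up to a character boundary so the slice below cannot panic.
    let mut end = max_len;
    while !name.is_char_boundary(end) {
        end -= 1;
    }
    let mut n = 1u32;
    loop {
        let candidate = format!("{}~{}", &name[..end], n);
        // `insert` returns true only if the candidate was not present yet.
        if used.insert(candidate.clone()) {
            return candidate;
        }
        n += 1;
    }
}
```

Note that the number assigned to a name depends on the order in which colliding names are processed, so the output is only reproducible if that order is.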
I'd go with the hash since, as long as it's well-defined (e.g. take a prefix of n - hash_length characters, then hash the remaining characters using <name of hash>, then append the hash to the prefix), it's possible to re-generate it from the source materials and get the same result if necessary, and it'll be more stable across runs, in case incremental rebuild is ever desired.
Stephan Sokolow at 2016-06-01 21:46:44
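The prefix-plus-hash scheme described above could look roughly like this in Rust. The comment leaves the hash function unspecified, so this sketch substitutes 64-bit FNV-1a purely as a stand-in because it is simple and deterministic; the names shorten_with_hash and HASH_LEN are hypothetical:

```rust
/// Length of the hex-encoded 64-bit hash appended to shortened names.
const HASH_LEN: usize = 16;

/// 64-bit FNV-1a, used here only as a stand-in for `<name of hash>`.
fn fnv1a_64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325; // FNV offset basis
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x100000001b3); // FNV prime
    }
    hash
}

/// Hypothetical helper (not a rustdoc API): names that fit are returned
/// unchanged; longer names keep a prefix of `max_len - HASH_LEN` bytes and
/// the remainder is replaced by its hash.
fn shorten_with_hash(name: &str, max_len: usize) -> String {
    if name.len() <= max_len {
        return name.to_string();
    }
    assert!(max_len > HASH_LEN, "max_len must leave room for the hash");
    // Back up to a character boundary so `split_at` cannot panic.
    let mut split = max_len - HASH_LEN;
    while !name.is_char_boundary(split) {
        split -= 1;
    }
    let (prefix, rest) = name.split_at(split);
    format!("{}{:016x}", prefix, fnv1a_64(rest.as_bytes()))
}
```

Because the result depends only on the name itself, the same input always maps to the same output regardless of processing order. In this sketch the caller would apply it to the item name and add any fixed prefix or .html extension afterwards.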
Triage: no change
Steve Klabnik at 2018-10-31 20:11:36
As of Windows 10 patch 1607 (which happened later in 2016, actually), users can now remove the path length limitations, and patches before that are only supported for Enterprise LTSC.
Jubilee at 2021-12-05 23:15:28
most linux filesystems still have the limitation of 255 bytes per path segment, and 4096 bytes for the entire path, at least according to linux/limits.h.
additionally, it seems that search engines don't like urls longer than 2000 chars
lolbinarycat at 2024-11-07 22:49:45
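For illustration, a small Rust check against the two values quoted above. The constants are hard-coded from that comment rather than queried from the system, and the function name violates_quoted_limits is hypothetical:

```rust
use std::path::Path;

// Values as quoted from linux/limits.h in the comment above; hard-coded
// assumptions for this sketch, not queried from the running system.
const NAME_MAX: usize = 255;
const PATH_MAX: usize = 4096;

/// Hypothetical helper: true if any single component is longer than
/// NAME_MAX bytes, or if the whole path plus a terminating null byte
/// would not fit in PATH_MAX bytes.
fn violates_quoted_limits(path: &Path) -> bool {
    let long_segment = path
        .components()
        .any(|c| c.as_os_str().len() > NAME_MAX);
    long_segment || path.as_os_str().len() >= PATH_MAX
}
```

A tool could run a check like this on each output path before writing it and fall back to a shortened name when it returns true.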
@lolbinarycat linux/limits.h is for libc functions that have internal limitations. It does not constrain the filesystems, and most Linux filesystems have no path length limit.
See, for example, the getcwd(3) manpage:
> getwd() does not malloc(3) any memory. The buf argument should be a pointer to an array at least PATH_MAX bytes long. If the length of the absolute pathname of the current working directory, including the terminating null byte, exceeds PATH_MAX bytes, NULL is returned, and errno is set to ENAMETOOLONG. (Note that on some systems, PATH_MAX may not be a compile-time constant; furthermore, its value may depend on the filesystem, see pathconf(3).) For portability and security reasons, use of getwd() is deprecated.
The TL;DR is that linux/limits.h defines a limit for paths handled by syscalls but, given that you can always break it by mounting a filesystem with PATH_MAX-length paths on a mountpoint deep inside another filesystem, it cannot be absolute, and programs/libraries which want to bypass that limitation will work around it by walking up the chain of ancestors and assembling the full path in userspace.
(As far as I'm aware, all filesystem operations are now supported by newer relative syscalls like openat, which let you circumvent having to work with absolute, canonicalized paths at some point inside the kernel. Just "this path, relative to that FD" or "this path, relative to that inode".)
I can confirm that I can generate test paths on ext4 which are over 5000 characters long, despite PATH_MAX being 4096 on my system... which does cause some things to then fail to get the working directory.
Stephan Sokolow at 2024-11-08 03:58:10
true, but you'll still encounter issues if you do the trivial approach, and NAME_MAX is still a bit tricky to get around.
lolbinarycat at 2024-11-08 05:43:51
Certainly. I just think it's important to make it clear that "most linux systems" don't have such a limitation... it's just certain standard library APIs that have the limitation.
(Basically, to avoid the problem I've had to work around with Serde where it can fail to serialize ext4 mtimes before the POSIX epoch that occurred due to something like metadata corruption on an old FAT12 floppy disk.)
Stephan Sokolow at 2024-11-08 05:46:48
Even if most platforms support it fine, it's still unreasonable to have such massive filenames that really don't provide any tangible benefit to the user. About the only reason I can think of to not do this is because it would break links to affected doc pages, and I personally think it's still worth doing.
Peter Atashian at 2024-11-11 00:04:50
To be fair, you could also argue that having type names that are that long is also unreasonable.
lolbinarycat at 2024-11-11 01:33:49