Since we have some subscribers, I thought I’d let you know that work has officially resumed and that the first steps towards version 1.3 are well under way.
What I plan to introduce are …
Source code level:
- const-correctness
- Namespaces (new own namespaces and proper usage of existing ones)
- C++ typecasts where possible
- Moving #defines to real C++ constants where possible
New features:
- Update check that can be enabled by the user to receive notifications
- Better mouse navigation
- Save a scan and load it later
- Save report into file (perhaps different formats)
- Proper handling of WOW64 environments
- Explicit Windows Vista symlink handling
- Configuration from different sources (incl. .ini file)
- Native x64 version
- Update: A brand new logo that has been created for and donated to the project (I will likely post a preview soon)
Features that will be removed:
- Support for Windows 9x/ME will be dropped in this upcoming release. Affected users can use the last 1.2 version …
- Support for Windows NT 4.0 may get dropped
- The feedback (bug report) dialog will be replaced by a link to our website where you will be able to enter descriptions of undesired behavior or simple praise
- The send report by mail feature will be removed; instead, you will be able to save reports to at least one computer-readable and at least one human-readable file format
// Oliver
What about hardlink support?
What kind of hardlink support do you have in mind?
There is no trivial way to learn more about a file than its number of (hard) links, so there is no easy way of knowing which “files” (i.e. which links to the same data) account for the size of the data only once and which account for it multiple times. In addition to the current memory requirements this would mean to add more without knowing whether the user has any utility from that – e.g. a user may have not a single hard link on the machine.
// Oliver
The Windows API tells you how many other links there are to the same data, and gives you a unique ID for that data.
http://msdn.microsoft.com/en-us/library/aa364952.aspx
You could show hardlinked files as size / number of seen links, or perhaps just show the first file at full size.
Hope this helps.
Oh, I am well aware of this API, just as I am aware of the underlying native APIs. However, this doesn’t change my assertion from before.
In addition to the name and the number of links I’d have to store a 64-bit number, of which only 48 bits are actually used, so I could fit the number of links (max. 1024) into the upper 16 bits of the 64-bit integer. So we add 8 bytes of memory for each and every item, not to mention the additional CPU time required (for each item as well). To allow easy lookup and computation of the sizes of hardlinked items, an additional array would be required that contains pointers (4 or 8 bytes, depending on whether the program is 32-bit or 64-bit) to each item with more than one link. Do you understand what this means?
I’ll think of it again, but this is too much tradeoff for too little value, in my opinion.
While the latter array will only grow when hardlinks are encountered, the 8 bytes per item are added unconditionally.
Agreed, the idea has something, but I need a little more convincing and will see how I can possibly decrease the memory requirements of the idea instead of sacrificing WDS’ snappy behavior and relatively slim memory requirements.
// Oliver
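To make the packing idea above concrete, here is a minimal sketch. It is written in Python for brevity (WinDirStat itself is C++), it takes the 48-bit figure from the comment above at face value, and the names `pack_link_info` and `unpack_link_info` are made up for this illustration.

```python
from typing import Tuple

FILE_ID_BITS = 48                        # bits assumed to be used by the file ID
FILE_ID_MASK = (1 << FILE_ID_BITS) - 1   # lower 48 bits
LINK_COUNT_MASK = (1 << 16) - 1          # upper 16 bits


def pack_link_info(file_id: int, link_count: int) -> int:
    """Store the link count in the upper 16 bits of one 64-bit value."""
    if not 0 <= file_id <= FILE_ID_MASK:
        raise ValueError("file ID does not fit into 48 bits")
    if not 0 <= link_count <= LINK_COUNT_MASK:
        raise ValueError("link count does not fit into 16 bits")
    return (link_count << FILE_ID_BITS) | file_id


def unpack_link_info(packed: int) -> Tuple[int, int]:
    """Return (file_id, link_count) from a packed 64-bit value."""
    return packed & FILE_ID_MASK, packed >> FILE_ID_BITS
```

The packed value still costs 8 bytes per item, which is exactly the overhead being debated; the sketch only shows that no extra field is needed for the link count.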
Okay, I do have one first idea, but it needs some more thorough review …
I’m really looking forward to saving the info to a file in 1.3. Can’t wait! Good luck!
Some command line options, like generation of report files, would be such a nice thing to have! I could schedule a task in the middle of the night and have a look at the report in the morning. I’m looking at tens of TB of data with this tool because it’s just so intuitive and so very easy to spot the 8GB WALL.E.1080p.BluRay.x264.HD1080.XLS files.
Also reports in a format that is easy to use from a script would be very, very useful.
I vote for hard link support too. The Unix du utility does hard link tracking, and it’s quite handy. The size is reported only the first time an inode is encountered. Any other links to the same inode are ignored.
I think your concerns about memory may be overstated. At 8 bytes per entry, that’s 8MB per million files. With the low cost of memory these days, that’s a pretty good tradeoff in my opinion. At least having the option would be wonderful.
I second the call for command line options.
Otherwise, great software!
Thanks,
Matt
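The du-style behavior described above can be sketched roughly as follows. This is a Python illustration, not how du or WinDirStat is actually implemented; it assumes a platform where `os.lstat` reports usable `st_dev`, `st_ino` and `st_nlink` values, and the function name is hypothetical.

```python
import os
import stat


def disk_usage_counting_links_once(root: str) -> int:
    """Sum file sizes under root, counting hardlinked data only once."""
    seen = set()   # (device, inode) pairs of multi-link files already counted
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                info = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # unreadable entry: skip it
            if not stat.S_ISREG(info.st_mode):
                continue  # only regular files are counted here
            if info.st_nlink > 1:
                key = (info.st_dev, info.st_ino)
                if key in seen:
                    continue  # another name for data we already counted
                seen.add(key)
            total += info.st_size
    return total
```

Only files with more than one link are remembered, so the extra set stays empty on a machine without hard links.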
I have used WinDirStat for the past few years, and keep using it…
But recently I had several ideas: would it be possible to implement an image snapshot of the file map (BMP? GIF?)
Another idea would be to enhance the display further by moving to a 3D representation of a drive … Combine the squares with some depth?
An isometric display system?
Just my recent ideas …
Thanks for a great tool.
Hello Oliver. Thanks for the update on the development process.
I can’t resist posting two feature requests:
Use SeBackupPrivilege when run as admin so that “System Volume Information” is scanned.
It would be neat to be able to copy the result to the clipboard as vector graphics and export it in a vector format.
The first one is planned; the second one is something that will probably never make it into the core. But I can imagine providing the means for a plugin to do it. Probably not in 1.3, though … but later.
// Oliver
Oh, concerning the backup privilege. Please note that this has another implication: it’ll likely slow down the scanning of the hard drive, because the privilege has to be requested per thread (not just global to the process) …
I use WinDirStat all the time to help manage disk space on file servers and SANs.
Some features I wish for are:
Save scans
Save the bitmaps
Some way to report on file properties, e.g. a report of files last accessed before June 17, 2002,
or reporting on file ownership to learn how much data is owned by user X on a volume.
One thing I’d ask for is to have the ‘save a scan’ and ‘save a report’ features available in at least one version that supports Win 9x and Me. I occasionally have to administer very old Win 9x and Me systems, and considering their speed, it would be exceedingly handy to run a scan overnight (or have someone else run it), then collect the scan file (or files, if this is being run on multiple systems) from them, and bring the files to a faster computer for analysis.
Someone needing this assistance would be able to run and save the scan, but wouldn’t know which files were safe to remove. If they could get those files to me, I could drill through them and compile a list of files to remove, then remove them in one shot, and then possibly re-run the scan overnight if necessary. If my only option were to run the utility, wait on-site perhaps a long time for it to come up, then delete files on that system itself, it would be much more difficult to use the utility.
Thanks for making it available (and easily portable). It’s on my shortlist of maintenance utilities in my flash-drive toolkit.
Hardlink awareness would be great, even if it costs some memory or CPU-cycles.
Thanks for this wonderful program!
It would be really nice to have some kind of Windows Explorer integration, via a context-menu option such as “Scan with WinDirStat” for drives and folders.
Support for Windows 9x/ME will be dropped? Does 9x mean XP? I intend to hold out on XP for a while. Anyhow, I really think this is one of the best software projects out there and really appreciate your work. Thank you.
I do have a wish list.
1. I would like some way to export the file structure to Visual Understanding Environment (VUE), an open source project located here: http://vue.tufts.edu/index.cfm
2. Also you might find this project interesting if you ever thought to go the 3d route. http://www.tibsoft.com/index.php?page=steptree
StepTree v1.8. I like this approach but I think you could take it to the next level as your software is by far the best. I believe the future of file management will end up as a kind of mind map structure with a zoomable interface or ZUI.
3. If you are implementing a way to save searches, a favorites menu would be nice for organizing those searches.
Thanks again
David Prouty
Is there any way to export the file number, size and percentage data into a text file or spreadsheet?
IMHO, hardlink support is essential.
In Windows Vista and Windows 7, the WinSxS directory is almost entirely made of hardlinks.
Scanning such a system partition with WinDirStat yields totally false reports.
Then the only method is not to use it. Windows is not exactly generous in offering support for entities like hardlinks. And it is certainly not an option to query every single file for its number of links and its file ID.
The only method I see is MFT parsing and, as mentioned elsewhere, I am working on that.
// Oliver
Quote – Oliver:
“In addition to the current memory requirements this would mean to add more without knowing whether the user has any utility from that – e.g. a user may have not a single hard link on the machine.”
Maybe hardlink support can be implemented as an option/checkbox and be disabled by default. Advanced users who are willing to sacrifice some of the “snappy behaviour” and low memory footprint can enable it if required (you might add a label such as “experimental” or “slow”), and Joe Sixpack isn’t bothered with unneeded functionality.
Keep up the good work!
Well, the interesting fact is that you would rather gain snappy behavior by “caching” stuff (i.e. more use of RAM, less use of CPU).
However, some information is in fact just “more expensive” in terms of CPU, IO and RAM. In the case of MFT parsing, the hard links don’t add any such problems. But MFT parsing introduces other issues (the MFT is more or less a plain list, not a hierarchy like the directories are on disk … thus all contents need to be put into the right form before they can be evaluated).
My biggest concerns with MFT parsing are about potential security issues (unprivileged user unable to use it or so) and the extended use of memory. But the advantages outweigh all of that so far.
For now, good night
(2:08am here)
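To illustrate the “plain list, not a hierarchy” point above, here is a small sketch of turning flat records with parent references back into a tree and summing sizes bottom-up. The `FlatRecord` layout is a made-up simplification for this example and not the real MFT record format; the root is modelled as a record that is its own parent.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple


class FlatRecord(NamedTuple):
    record_id: int
    parent_id: int   # the root record refers to itself
    name: str
    size: int        # 0 for directories in this sketch


def cumulative_sizes(records: Iterable[FlatRecord]) -> Dict[int, int]:
    """Rebuild the hierarchy from a flat list and total each subtree."""
    by_id = {rec.record_id: rec for rec in records}
    children = defaultdict(list)
    for rec in by_id.values():
        if rec.parent_id != rec.record_id:
            children[rec.parent_id].append(rec.record_id)

    totals: Dict[int, int] = {}
    roots = [rid for rid, rec in by_id.items() if rec.parent_id == rid]
    for root in roots:
        # Iterative post-order walk: a node is totalled after its children.
        stack = [(root, False)]
        while stack:
            rid, expanded = stack.pop()
            if expanded:
                totals[rid] = by_id[rid].size + sum(
                    totals[c] for c in children.get(rid, ()))
            else:
                stack.append((rid, True))
                stack.extend((c, False) for c in children.get(rid, ()))
    return totals
```

Nothing can be totalled until the whole list has been read, since the records may arrive in any order; that is the extra memory cost mentioned above.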
I’d also like to cast my vote for command-line switches for WinDirStat. I’d really like the ability to automate/schedule email reports.
Actually the email reporting feature is going out. Reports will instead be saved into files and you will have to use a tool designed to send emails (such as blat) to do that part.
Great tool.
I’d suggest a command line option which controls how many levels of subdirectories to scan. It will be useful for tailoring the scan results.
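One possible reading of this request is a traversal that stops descending below a given depth. The sketch below is in Python and purely illustrative; `walk_limited` and its `max_depth` parameter are hypothetical names, not an existing WinDirStat option.

```python
import os
from typing import Iterator, Tuple


def walk_limited(root: str, max_depth: int) -> Iterator[Tuple[str, int]]:
    """Yield (directory, depth) for directories at most max_depth below root."""
    stack = [(root, 0)]
    while stack:
        path, depth = stack.pop()
        yield path, depth
        if depth >= max_depth:
            continue  # do not descend any further
        try:
            with os.scandir(path) as entries:
                for entry in entries:
                    # Symbolic links are not followed, to avoid cycles.
                    if entry.is_dir(follow_symlinks=False):
                        stack.append((entry.path, depth + 1))
        except OSError:
            continue  # unreadable directory: skip it
```

A real scanner would still need the sizes of everything below the cut-off if the totals are to stay correct, so such a limit would mostly affect what is displayed or reported.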
A big +1 vote for command line options – I’d love to be able to output the treemap bitmap to file as a scripted task. Then I could have self-updating web-based reports with pictures. But still, what an awesome tool already! We love your work Oliver.
Hi,
How do I speed it up when I point it at a network share?
Pointing it at a 1 TB drive, it takes more than 16 hours to read it.
Running 2008 R2 – SMB2.
What about hardlink handling? Is it good now?
@cavallogoloso: what do you mean? There is no ideal solution for hardlinks and no one has as of yet come up with a brilliant idea …
I simply mean: take them into account and calculate the space “as if they were real” (for example, the space the files would need “if I copied all of them off this disk”), and also add an entry to the list of extensions dedicated to data blocks with more than one hardlink.
But I realize that this feature would be useful to very few people.
Superb tool and thanks!
Would like to see the following features given the whole work/life/family/home balance!
1) Command-line support to allow easy scripting (previously mentioned)
2) More extensive user-defined reporting – somewhat like creating filters in SequoiaView (see 3)). Ability to take nightly snapshot bitmaps of the file system – could lead to animation of file system view?
3) File filtering/masking – only show me multimedia files or documents. Ignored files get accumulated in an ignored bucket for display or treemap block.
4) Option to show Owner, Creator, Modified tipstrip on any file in the map view e.g. who dumped the 14GB file on the local server?
5) User-definable colours on the map.
Just some thoughts!
-Graeme
Hi,
Here is a little Windows registry script (copy/paste it into a file named something.reg) that adds WinDirStat to the Explorer context menu for directories (I find this very useful for launching WinDirStat; maybe it could be included in the installer, because the installation path has to be set):
I agree that ‘perfect’ hardlink support is impossible, and ‘very good’ might be expensive. Here is a suggestion for a compromise that I think would be useful and not too expensive.
Currently you must have a cumulative size counter for each file/folder, as well as various other cumulative counters such as Items. Use instead two such size counters. One is exactly as now. The second accumulates [size/numberOfHardlinks]. There would be a toggle for displaying percentages and the treemap using the original size or the compensated size.
The new figure should be correct at the disk level. It would need to be interpreted with care when looking at subdirectories; but would still be useful.
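The two-counter suggestion above could be sketched like this. It is a Python illustration with a made-up `Item` structure, not WinDirStat’s actual data model; each hardlinked name contributes size / link count to the compensated total.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Item:
    name: str
    size: int = 0          # file size; ignored for directories
    link_count: int = 1    # number of hard links to the file's data
    children: List["Item"] = field(default_factory=list)


def accumulate(item: Item) -> Tuple[int, float]:
    """Return (raw size, hardlink-compensated size) for the subtree."""
    if not item.children:
        return item.size, item.size / item.link_count
    raw, compensated = 0, 0.0
    for child in item.children:
        child_raw, child_comp = accumulate(child)
        raw += child_raw
        compensated += child_comp
    return raw, compensated
```

For a 900-byte file with three names spread over three directories, the whole tree adds up to 900 compensated bytes, while a single directory shows only 300. That is the “interpret with care” caveat for subdirectories.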
Another useful feature would be a cumulative ‘Last Accessed’ field. This would help a decision to delete/migrate/backup a large directory none of whose files had been accessed for a considerable period.
By the way: thank you for an excellent program.
I was researching hardlinks and symbolic links as a refresher for myself, and found these on Microsoft’s website.
Can’t you just use the FILE_ATTRIBUTE_REPARSE_POINT 1024 (0x400) attribute from FindFirstFile to detect linked files? And if a file is a link, exclude its size from the total?
I’m aware of that attribute, but a program that is handling files should probably be able to distinguish volume mount points, junction points and symbolic links. And IIRC only symbolic link targets don’t have to exist.
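As a rough illustration of the distinction drawn in the reply above, the attribute bit only says that an entry is a reparse point; the reparse tag (which FindFirstFile returns in the `dwReserved0` field of `WIN32_FIND_DATA` when the bit is set) says which kind. The sketch below is plain Python working on the numeric values, so it makes no Windows calls; `classify` is a hypothetical helper name.

```python
FILE_ATTRIBUTE_REPARSE_POINT = 0x400
IO_REPARSE_TAG_MOUNT_POINT = 0xA0000003   # junctions and volume mount points
IO_REPARSE_TAG_SYMLINK = 0xA000000C       # symbolic links


def classify(attributes: int, reparse_tag: int = 0) -> str:
    """Classify a directory entry from its attribute bits and reparse tag."""
    if not attributes & FILE_ATTRIBUTE_REPARSE_POINT:
        return "regular"
    if reparse_tag == IO_REPARSE_TAG_SYMLINK:
        return "symbolic link"
    if reparse_tag == IO_REPARSE_TAG_MOUNT_POINT:
        # Telling a junction from a volume mount point needs the target path.
        return "junction or volume mount point"
    return "other reparse point"
```

Note that a hard link is simply another name for the same file and does not carry the reparse point attribute, so this check would not detect the hardlinks discussed earlier in the thread.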
Is there a download available for v1.3 yet or a planned release date?
This is a great tool, keep up the good work.
Very useful tool. Would like to see a way to export reports, as not every system has email client software installed.