WizTree is famously almost 50x faster than WinDirStat (on normal Windows NTFS dr...

mardifoufs · on May 23, 2024

What's the downside of just reading the MFT? Why doesn't Microsoft do it in file explorer, and why wouldn't every tool use it instead of walking through the file system? Maybe there's no downside but it's such a huge speed boost that it would be weird to not use it otherwise, right?

jasode · on May 23, 2024

>What's the downside of just reading the MFT? Why doesn't Microsoft do it in file explorer, and why wouldn't every tool use it instead of walking through the file system?

One disadvantage is that you can't read the MFT of network shares or device emulators presenting "virtual drive letters" to the OS.

The typical (and slower) Win32 API functions FindFirstFile()/FindNextFile() used to iterate through the file structure work at a higher level of abstraction, so they work on more targets that don't have an NTFS MFT. Indeed, if you point WizTree to an SMB network share, it will be a lot slower because it can't directly read the MFT.
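
For reference, the portable tree walk can be sketched in Python as below. On Windows, CPython's `os.scandir` is implemented on top of FindFirstFileW/FindNextFileW, so this is the slow-but-works-everywhere path rather than an MFT read. The name `tree_size` is just illustrative.

```python
import os


def tree_size(root: str) -> int:
    """Sum file sizes under root by listing one directory at a time."""
    total = 0
    stack = [root]
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    if entry.is_dir(follow_symlinks=False):
                        # Descend later; symlinked directories are not followed.
                        stack.append(entry.path)
                    elif entry.is_file(follow_symlinks=False):
                        total += entry.stat(follow_symlinks=False).st_size
        except (PermissionError, FileNotFoundError):
            # Skip directories we may not list or that vanished mid-walk.
            continue
    return total
```

Every directory costs at least one round trip through the file system stack, which is why this approach slows down so much on large volumes and network shares.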

It's conceivable that Microsoft developers could have programmed Windows Explorer differently to have an optimized code path of reading MFT for local disks and then fall back to slower FindFirstFile()/FindNextFile() for non-MFT disks. Maybe that adds too much complexity and weird bugs. I notice that most of the 3rd-party "Win Explorer replacement" utilities also don't read MFT.
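
A minimal sketch of that "fast path with fallback" shape, in Python. Everything here is hypothetical: `mft_scan` and `walk_scan` are caller-supplied functions (this is not how Explorer or any real tool is structured), and `fs_type` is assumed to be the file system name the caller already looked up for the volume.

```python
from typing import Callable, Dict

# A scanner maps a root path to {path: size_in_bytes}.
Scanner = Callable[[str], Dict[str, int]]


def scan_volume(root: str, fs_type: str,
                mft_scan: Scanner, walk_scan: Scanner) -> Dict[str, int]:
    """Try the MFT-based scanner on NTFS, otherwise use the generic walk."""
    if fs_type.upper() == "NTFS":
        try:
            return mft_scan(root)
        except OSError:
            # E.g. not elevated, or the volume cannot be opened raw:
            # fall through to the slower but universal path.
            pass
    return walk_scan(root)
```

The extra complexity mentioned above shows up immediately: two code paths have to agree on results, and the fast one can fail in ways the slow one cannot.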

robertlagrant · on May 23, 2024

> It's conceivable that Microsoft developers could have programmed Windows Explorer differently to have an optimized code path of reading MFT for local disks and then fall back to slower FindFirstFile()/FindNextFile() for non-MFT disks

Surely this would have been worth doing, even if it meant flushing out bugs elsewhere.

becurious · on May 23, 2024

Along with the reasons others have mentioned, it would also bypass any filter driver in the file system stack (Windows has the concept of a stack of filter drivers that can sit in front of the file system or hardware) and would also ignore any permissions (ACLs) on who can see those files. There’s no way they can credibly use this technique outside of say something from SysInternals: it violates the security and layering of the operating system and its APIs.

mardifoufs · on May 23, 2024

Is there a Linux equivalent for those "filters"? I'm a bit clueless about win32 and NT sadly enough...

Would that mean that there's no way to "scope" the MFTs?

Edit: That also makes sense, since if I got it right they aren't necessarily supposed to be consumed by userspace programs?

I guess that's why those tools always ask for admin access and basically all perms to the FS.

It's a bit sad that the user gets exposed to a much slower search and FS experience even if the system underneath has the potential to be as fast as it gets. And I don't think ReFS is intended to replace NTFS (not that it's necessarily more performant anyways)

wongarsu · on May 23, 2024

There is no equivalent on Linux. That's why Linux has no online antivirus scanners (scanners that scan the file as it's opened) while this is a basic feature of every antivirus program on Windows.

Linux has device mappers (dm-crypt, dm-raid and friends). But those sit below the file system, emulating a device. Windows' file system filter drivers sit above the file system, intercepting API calls to and from the file system. That's super useful if you want to check file contents on access, track where files are going, keep an audit log of who accessed a file, transparently encrypt single files instead of whole volumes, etc. But you pay the price for all that flexibility in performance.

Spivak · on May 23, 2024

Sure there is, you're talking about fanotify.

https://man7.org/linux/man-pages/man7/fanotify.7.html

https://lwn.net/Articles/339399/

It even lets you block the access until the scan/decision is made.

account42 · on May 28, 2024

> That's super useful if you want to check file contents on access, track where files are going, keep an audit log of who accessed a file, transparently encrypt single files instead of whole volumes

Or if you just want to generally make the filesystem so slow that everyone has to invent their own pack files just to avoid file system api calls as much as possible.

SSLy · on May 23, 2024

What are the APIs related to this named?

mastax · on May 23, 2024

IO Minifilter drivers are the modern version: https://learn.microsoft.com/en-us/windows-hardware/drivers/i...

loeg · on May 23, 2024

Filters are vaguely similar to things like mountpoints overlaying portions of the filesystem. E.g. in Linux you might have files in /d1/d2/{f1,f2,f3} in the root filesystem but you also have a mountpoint of a 2nd filesystem on /d1/d2 that completely changes the visibility / contents of d2. Filter drivers can do similar things (although they are not actually independent mountpoints).

webstrand · on May 23, 2024

I believe they're approximately equivalent to FUSE

RobotToaster · on May 23, 2024

> it violates the security ... of the operating system

Maybe stating the obvious, but if the security can be violated that easily, it's not very secure.

wongarsu · on May 23, 2024

You need admin permissions to read the MFT on Windows. The traditional security model of both Windows and Linux assumes that the kernel is a security barrier between system and unprivileged user, and between different unprivileged users. An admin being able to bypass security restrictions isn't traditionally seen as a problem.

freedomben · on May 23, 2024

Indeed, only in very recent history has the admin/root user/owner been seen as a threat to the system and the system employs defenses against them. I'm hoping that trend reverses because I really hate the direction things are going.

slaymaker1907 · on May 24, 2024

There are pretty good reasons to do that. We've been really lax in what is allowed to run as root/admin when in reality, those permissions should only be used when doing things like reading the MFT or snooping on all the network traffic with Wireshark. It should not be required to run as root/admin in order to install most software because installing software is a very common thing to do.

Even if you want more control over your system, I still think technically capable people would be better served by having a separate administrator account from your normal day-to-day account which you have to explicitly log into (so no UAC prompts, you need to go onto that other account and then you get the UAC prompt). Unfortunately, I think most Desktop OSes are still too unusable with this sort of workflow due to how much software insists on admin for installation.

freedomben · on May 24, 2024

I largely agree. I think what makes the "the user is a threat" model so difficult to me is that there is a lot of truth to it. Users often don't know enough to make good decisions.

I really like your idea of logging in separately, such that it isn't something you're going to do cavalierly. That seems like a great compromise to me! I fully agree that we way overuse admin and really don't need it for the majority of things.

RobotToaster · on May 24, 2024

Then it doesn't violate the security of the OS, if you need to be an admin to do it.

skissane · on May 25, 2024

> it would also bypass any filter driver in the file system stack

The main use case for filter drivers is antivirus, and that is primarily about file contents not file metadata - so if MFT access bypassed filter drivers, that might not be a major issue. I think most non-antivirus use cases are also primarily about data not metadata.

If necessary, one could even devise a design in which MFT access is combined with filter drivers - MFT scanning to find matching files, then for each matched file access its metadata via standard APIs (to ensure filter drivers are invoked) before returning to client. That would be slower than a pure MFT scan but still faster than a scan done purely with standard APIs. A registry key could turn this on/off so sites can decide for themselves where to place the performance versus security tradeoff

> and would also ignore any permissions (ACLs) on who can see those files

They could expose an API which enables MFT scanning with some degree of ACL checking added.

If you do the ACL check as late as possible in processing the query, it would give much better performance than standard APIs that evaluate ACLs on every access. For example, suppose I want to scan a volume for all files with the extension ‘*.exe’. The API would only have to do an ACL check on each matching entry, not on every entry it considers.

There also might be reasonable situations in which ACL checking could be bypassed. For example, if I am requesting a search for files of which I am the owner, just assume the owner should have the right to read the file’s metadata. Or, if I have read permission on a directory, assume I am allowed aggregate information on the count and total size of files in that directory and its recursive subdirectories. These “bypasses” could be controlled by system settings (registry entries / group policy), so customers with higher security needs could disable them at the cost of reduced performance.

Rather than putting this in the OS kernel, it could be a privileged system service which exports an API over LPC/COM/etc. Actually with that design it isn’t even necessary to wait for Microsoft to implement this, it could always be implemented as an open source project, if someone felt sufficiently motivated to do so. (Or even as a proprietary product, although I suspect that would limit its adoption, and the risk is if it takes off, Microsoft would just implement the same thing as a standard part of Windows.)
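
The "ACL check as late as possible" idea above can be sketched like this in Python. It is a toy model, not an NTFS or Windows API: `Record`, `matches` and `can_read` are hypothetical stand-ins for an MFT entry, the query predicate and the access check.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Record:
    path: str
    size: int


def late_acl_search(records: Iterable[Record],
                    matches: Callable[[Record], bool],
                    can_read: Callable[[Record], bool]) -> List[Record]:
    """Apply the cheap query filter first; ACL-check only the survivors."""
    visible = []
    for rec in records:
        if not matches(rec):
            continue  # rejected by the query, so no ACL work is spent on it
        if can_read(rec):  # expensive check, run only on matching entries
            visible.append(rec)
    return visible
```

With a selective query such as `*.exe`, the number of access checks scales with the number of matches instead of the number of entries on the volume.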

password4321 · on May 23, 2024

Reading the MFT directly requires Administrator permissions, and doing it correctly means reimplementing support for every nook and cranny of NTFS including things like hard links, junction/reparse/mount points, sparse files, etc.

hd4 · on May 23, 2024

Spacemonger uses the MFT and doesn't require Administrator privileges

justsomehnguy · on May 23, 2024

AFAIR MFT access requires Administrator/SYSTEM rights and there is absolutely no way to read it as a regular user.

The only workaround (used by Everything by VoidTools) is to install a service which runs with the needed rights and have the GUI communicate with it.

faeriechangling · on May 23, 2024

You call that a workaround but it’s basically the best possible situation security-wise. If this didn’t work securely then it wouldn’t be possible to implement disk defragmenter or even explorer. It’s so core to Windows NT’s security model that I wouldn’t call it a workaround.

You do similar things even with more modern stacks - assign a permission to an application and grant permissions to the application to the user.

The only real concern is that Windows NT permissions are not as granular as they could be.

mananaysiempre · on May 23, 2024

> Windows NT permissions are not as granular as they could be.

For objects, Windows NT permissions are ridiculously granular; e.g. GENERIC_WRITE can be mapped to a half-dozen separately settable type-specific flags, depending on the object type (file, named pipe, etc.). It’s too granular for even an administrator to make sense of, arguably, and the documentation is somewhere between bad and nonexistent. (The UI varies from decent, like the ACL editor you can access from e.g. Explorer, to “you can’t make this shit up”, like SDDL[1].)

For subjects, the situation is not good, like on every other conventional OS. You could deal with that by introducing a “user” for each app, as on Android. But I’m not aware of any attempts to do that (that would expose this mechanism in a user-visible way).

(Then there’s the UWP sandbox, which as far as I can tell is built with complete disregard of the fundamental concepts above. I don’t think it’s worth taking seriously at this time.)

[1] https://learn.microsoft.com/en-us/windows/win32/secauthz/sec...

faeriechangling · on May 23, 2024

I have no idea if there’s a granular object permission that could give access to the MBR of a disk. I’ve thankfully never had to dig that deep into Windows internals.

I’ve had to work with SDDL before to set up granular permissions for WMI monitoring on a whole lot of computers and my god, did it make me love the Cloud and Linux. I can’t emphasize enough how much the unintuitiveness of setting these permissions creates systemic over-privileging.

smusamashah · on May 23, 2024

Is this the Spacemonger you are talking about https://web.archive.org/web/20121126062443/http://www.sixty-...

It does not say anything like that in the FAQ and I don't remember it being fast.

hd4 · on May 23, 2024

Yes that one. Just use it and see. It's blazing fast.

adzm · on May 23, 2024

It uses FindFirstFile etc https://github.com/seanofw/spacemonger1/blob/6a41c012534b170...

password4321 · on May 23, 2024

I thought you meant the $15 utility from Stardock, but if not then I'm fairly confident it's not reading the MFT.

https://github.com/seanofw/spacemonger1/blob/6a41c012534b170...

hd4 · on May 23, 2024

It's still interesting that they got it to work as fast and precise as they did.

smusamashah · on May 23, 2024

Just learned that it's open source now https://github.com/seanofw/spacemonger1

soylentcola · on May 23, 2024

Been using the portable version of 1.4 for decades after first coming across it in some PC magazine or something like that many years ago. Not terribly pretty, but it does what I need and it still works.

dspillett · on May 23, 2024

> What's the downside of just reading the MFT?

One possible reason is that it isn't a published part of the filesystem's external interface, and the format is not guaranteed to be static between versions or even point releases (though in reality, while the behaviours may be officially undefined, they are unlikely to change significantly).

Also, it requires admin elevation to access. Anything running elevated is a potential security concern as it can access much else too.

> Why doesn't Microsoft do it in file explorer

Not sure, but it could be because that would be seen as an unfair advantage so to avoid anti-trust allegations they would have to publish the format and make stability guarantees for it, so others could use it as easily/safely. That, and the reasons above & below too.

> and why wouldn't every tool use it instead of walking through the file system?

Largely because walking the filesystem works for all filesystems, local and remote, so you cover everything with one tree walk implementation. Implementing a tree-walk over the MFT data where available is extra work to implement and support for one filesystem, and not many care enough, or are not aware of the potential speed benefit at all, for it to be a huge selling point such that all toolmakers feel compelled to bother.

WaitWaitWha · on May 23, 2024

> One possible reason is that it isn't a published part of the filesystem's external interface, and the format is not guaranteed to be static between versions or even point releases (though in reality, while the behaviours may be officially undefined, they are unlikely to change significantly).

I am not going to pull every document, but the MFT structure is documented and published. I am uncertain what you mean by "external interface".

"About 9,810 results (0.04 sec)"

https://scholar.google.com/scholar?hl=en&as_sdt=0%2C11&q=mft...

userbinator · on May 23, 2024

Moreover it is documented by Microsoft itself: https://learn.microsoft.com/en-us/windows/win32/devnotes/mas...

dspillett · on May 24, 2024

Though all the sub-pages of that state things like “[This structure is valid only for version 3 of NTFS volumes; it may be altered in future versions.]” — while it is true that any API could see breaking changes in future, this suggests that you should expect them, so I'd not call it supported in the same sense of the main file/directory access APIs which I would not expect to see breaking changes in (additional properties & functionality yes, but not existing things changing behaviour).

dspillett · on May 24, 2024

A lot of people talking about the details does not constitute official documentation though.

You can find a lot of articles talking about SQL Server's DBCC IND and DBCC PAGE, but that isn't official documentation: they are essentially internal functions, not supported, and could change or go away entirely despite having been around for many versions (as they have in Azure). Similarly, there are articles talking about sys.dm_db_database_page_allocations, which sort-of does the job of DBCC IND, but again this is not officially documented & supported.

> I am uncertain what you mean by "external interface".

I meant the published interface. Maybe "supported API" would have been a better phrase to use?

Though as pointed out below, there is at least some official documentation on the MFT structure.

loeg · on May 23, 2024

It's probably also racy to access the raw MFT while there are concurrent programs creating new files (or deleting files). That complication can be avoided by using the ordinary OS directory iteration primitives.

hi-v-rocknroll · on May 23, 2024

Yep, but then the performance gains are completely discarded. The easiest solution is to take a snapshot with VSS, which is both fast and makes a quiesced copy of $MFT. From there, one could monitor FS changes if they wanted to have live updates.

hi-v-rocknroll · on May 23, 2024

With RAM sizes now, it's curious why any OS wouldn't just cache some or all of the metadata for some local volumes on a block basis rather than incur the greater resource usage of transforming disk data into different structures, and then caching and tracking individual entries.

matthews2 · on May 23, 2024

More MFT goodness: the file search tool Everything (https://www.voidtools.com/)

jug · on May 23, 2024

It's crazy how the Windows Search Indexer still doesn't use MFT.

It doesn't even bloody support network drives so there's no such reason.

rezolva · on May 23, 2024

I am building an advanced file manager (FileNinja) for Windows with fully integrated Everything search & query. You have the option of saving bookmarks to virtual folders that consist of Everything searches. Instant directory sizes, tags, custom file descriptions for NTFS. Anyone interested? https://youtu.be/JREufgkf5pk?si=sP05UCOrskpX8OTq

Multicomp · on May 23, 2024

I'm interested! Great marketing video by the way, a good example of using AI-powered voiceovers to level up the one-man-marketing polish capabilities.

walt_wu · on May 24, 2024

Cool, but what is the biggest feature compared to Everything and Listary?

jron · on May 23, 2024

Do you have a git repo to follow?

joakim0 · on May 23, 2024

You can check out the following sites: https://github.com/sandeberger, http://thefile.ninja, or my homepage at https://kodar.ninja. The project is not open source.

SuperHeavy256 · on May 23, 2024

haha I like the voiceover, the video is fun

xen2xen1 · on May 23, 2024

So that's why Everything is so fast. Nice.

LelouBil · on May 23, 2024

I want to like Everything but every time I start it up it takes 30 sec to 1 minute to update its index

letmevoteplease · on May 23, 2024

Try Everything 1.5a - an "alpha" version with many improvements, in development for years but inexplicably hidden away on their website. Never experienced any instability.

https://www.voidtools.com/forum/viewtopic.php?t=9787

skeaker · on May 23, 2024

Wow! Shocked that this is the first I've heard of this given that I've been using Everything for years now. Thanks for the link.

kuro_neko · on May 23, 2024

Love Everything, but I had no idea there was an update... I'll have to try it right away, thanks!

xnx · on May 23, 2024

You can set it to run on startup or as a service so it updates the index in the background.

naikrovek · on May 23, 2024

You should not be starting it when you want to search. You should open it when you log in, and leave it in the tray. It will do a full index on launch then subscribe to filesystem notifications to keep itself up to date for as long as it’s open.
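
The "index once, then apply change notifications" pattern described above can be sketched as follows. This is a toy model in Python and not how Everything is actually implemented: the `FileIndex` class and the event names `created`/`deleted`/`renamed` are assumptions for illustration.

```python
import ntpath
from typing import Iterable, List, Optional, Set


class FileIndex:
    """In-memory name index kept current by incremental change events."""

    def __init__(self, initial_paths: Iterable[str]) -> None:
        # The expensive part: one full scan at startup.
        self._paths: Set[str] = set(initial_paths)

    def apply(self, event: str, path: str,
              new_path: Optional[str] = None) -> None:
        """Update the index from one notification instead of rescanning."""
        if event == "created":
            self._paths.add(path)
        elif event == "deleted":
            self._paths.discard(path)
        elif event == "renamed" and new_path is not None:
            self._paths.discard(path)
            self._paths.add(new_path)
        else:
            raise ValueError(f"unsupported event: {event!r}")

    def search(self, needle: str) -> List[str]:
        """Case-insensitive substring match on the file name only."""
        needle = needle.lower()
        return sorted(p for p in self._paths
                      if needle in ntpath.basename(p).lower())
```

Searches never touch the disk, which is why results feel instant once the initial scan is done, and why closing the program throws that work away.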

Do that and it’s alarmingly fast and responsive except for the minute or two right after launch.

benjaminpv · on May 23, 2024

Contrasting seemingly all the other responses to this, I use it the same way you do (only opening it when needed) and I'm fine with the delay: even at its slowest, rebuilding the index and searching is faster than the built-in Windows Search.

Nexxxeh · on May 23, 2024

Uninstall, re-install as a service which may now be default.

ziml77 · on May 23, 2024

Better as a service too because the GUI doesn't need to request admin rights.

Saris · on May 23, 2024

It should be starting at boot if you installed it as a service, so the indexing will be done then. After that opening and searching is instant.

la_oveja · on May 23, 2024

essential tool

CJefferson · on May 23, 2024

WizTree also understands things like OneDrive and Dropbox, and knows that files "stored in the cloud" aren't taking up any disc space -- WinDirStat thinks my drive is 140% full.

cm2187 · on May 23, 2024

What about hard links?

useless_foghorn · on May 23, 2024

WizTree and WinDirStat will both double count hard links. I have a 12TB hard drive holding "17TB" because of sparse files and hard links. Windows file manager properties agree with WizTree and WinDirStat as far as space used. I think the file manager looks for free space and calculates that separately, while WizTree and WinDirStat are just adding up used space.
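
To illustrate the double counting, here is a minimal Python sketch that computes both the naive total and a total that counts each hard-linked file once, keyed on `(st_dev, st_ino)`. It is not how either tool works internally, the function name is made up, and it uses logical sizes, so it does nothing about sparse files.

```python
import os
import stat
from typing import Tuple


def naive_and_unique_size(root: str) -> Tuple[int, int]:
    """Return (sum over every directory entry, sum over distinct files)."""
    naive = 0
    unique = 0
    seen = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.stat(os.path.join(dirpath, name), follow_symlinks=False)
            if not stat.S_ISREG(st.st_mode):
                continue  # ignore symlinks and special files
            naive += st.st_size
            key = (st.st_dev, st.st_ino)  # same key for all hard links
            if key not in seen:
                seen.add(key)
                unique += st.st_size
    return naive, unique
```

The gap between the two numbers is the space that exists only on paper because several names point at the same data.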

password4321 · on May 23, 2024

WizTree is no longer free for commercial use.

I believe version 3.38 was the last version that is completely "free as in beer" with optional donations.

faeriechangling · on May 23, 2024

> WizTree isn't open-source like WinDirStat but "free as in beer" with optional donations.

Which is enough for me to not use it because WinDirStat still only takes a minute. Cool software though.

barfbagginus · on May 24, 2024

WizTree takes like 3 seconds where WDS takes 30. In really big analysis and cleanup scenarios with rescans, it's enough to let you do your job faster. In everyday scenarios, it removes any hesitation to visualize a system. It's basically free and near instant.

Fact is, WDS community must be kind of abandoned, or else it would be doing the same trick. It's SO much faster that it becomes a genuine quality of life improvement. I need it, and don't mind using a non free tool until the OSS solution has the capability.

8372049 · on May 23, 2024

Exactly this.

_zamorano_ · on May 23, 2024

Didn't try AltWinDirStat, but did try FastWinDirStat.

The thing is, FastWinDirStat uses a licensed proprietary component. No problem for me, but the author did have some back and forth with another user on GitHub.

It seems FastWinDirStat's license isn't compatible with using a closed-source library, or something...

As for its actual functioning, it does as it says. Works much faster than WinDirStat

actionfromafar · on May 23, 2024

Looks like a pretty clear violation of the WinDirStat license. They took WinDirStat which is GPL, linked it with some other proprietary code and distributed the result.

(They could have been clear-ish (with caveats) by distributing only the source code and let the users do the compiling and linking, similarly to how you could download ZFS and build it into Linux. But you mustn't distribute the result further.)

brnt · on May 23, 2024

I'm a big Filelight fan. It used to not work well on NTFS volumes: it would miss files flagged Archived. Has that been solved?

SSLy · on May 23, 2024

I wish there was a duplicate file finder that used the MFT scan to pre-process the data instead of the FS tree walk
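
One way to read that wish: any fast metadata scan yields `(path, size)` pairs, and sizes alone rule out most files before anything is read from disk. A minimal Python sketch under that assumption (the records could come from an MFT scan or a plain tree walk; `find_duplicates` is a hypothetical name, not taken from any existing tool):

```python
import hashlib
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def _sha256(path: str) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(records: Iterable[Tuple[str, int]]) -> List[List[str]]:
    """Group by size from metadata first; hash only same-size candidates."""
    by_size: Dict[int, List[str]] = defaultdict(list)
    for path, size in records:
        by_size[size].append(path)

    groups: List[List[str]] = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate: never read it
        by_hash: Dict[str, List[str]] = defaultdict(list)
        for path in paths:
            by_hash[_sha256(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return sorted(groups)
```

The faster the metadata pass, the sooner the expensive hashing stage can start on the small set of real candidates.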

barfbagginus · on May 24, 2024

Do you use czkawka?

https://github.com/qarmin/czkawka

I don't think it uses the MFT, but it's the fastest and most flexible open source dup finder as of last year or so

tyleo · on May 23, 2024

You’ve got me interested but I’m finding it quite annoying that WizTree doesn’t actually have pictures of the software UI on the website. At least not under any of the obvious links I’ve checked.

Tijdreiziger · on May 23, 2024

If you want to see screenshots of any piece of software, just search the name of the software on your favorite search engine and go to ‘images’.

(This might seem obvious, but it took me a long time to realize, hence why I’m passing the tip on.)

barfbagginus · on May 24, 2024

Just try it, if you hoard data or clean up client systems. It's the best Windows file size checker in 21 years, since WinDirStat appeared.

It's not just an improvement, it's freaking astonishing.

I sincerely hope that the author open sources it one day, or that MFT-based solutions come to open source.

It's just life changing and will change when you want to do a scan. It removed any sense of hesitation or time waste.

rldjbpin · on May 24, 2024

having used both on my pc, can attest to the speed claims. wiztree has yet to demonstrate annoying freeware/donationware pop ups during daily use.

Sakos · on May 23, 2024

Seeing a description directly in the README for the folders in the repo and their contents makes me really happy. I wish more projects would do that.

michaelcampbell · on May 24, 2024

Thanks for this; it is incredibly faster. Never heard of it before.

rkagerer · on May 23, 2024

FileLocator Pro is a good search tool that also uses the MFT.

burnte · on May 23, 2024

SpaceSniffer is a much easier to use tool.

barfbagginus · on May 24, 2024

SpaceSniffer's UI is less clunky, but WizTree's scan is an order of magnitude faster. That kind of speed difference can affect when you're willing to use the tool.

I find myself much more willing to pop open WizTree to get a quick view of my system or a particular storage folder.