All it takes is a cursory search on some Apple discussion boards or mailing lists for “AFP crash” or “DirectoryService crash” to turn up a load of discussions on the topic.
The summary of the problem is basically this: The DirectoryService process crashes for some reason, then gets restarted by launchd. However, AFP (or more specifically, the AppleFileServer process) appears to not regain its connection to it. This prevents any new AFP connections from being able to authenticate, and existing ones are unable to re-authenticate. Couple this with AFP mounted home directories, and now your users can’t log in to their workstations, or their existing session hangs.
In said discussions there are dozens of proposed workarounds. These include periodically HUP’ing the AppleFileServer process, setting up some crazy firewall rules, periodically toggling guest access, and numerous other things. I personally have tried many of them and can confidently say that none of them is a good solution. Toggling guest access seems to mitigate the problem to some degree, but eventually things still come down hard.
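For anyone curious, the HUP workaround boils down to something like the rough sketch below. It is only an illustration and not a recommendation, since as I said it did not hold up for us. The function names (find_pids, hup_afp) are made up for this example, and I'm assuming the process shows up in ps output under the name AppleFileServer and that the script runs as root on the server.

```python
import os
import signal
import subprocess

PROCESS_NAME = "AppleFileServer"  # process name as given in this post


def find_pids(ps_output, name):
    """Return the PIDs in `ps -o pid= -o comm=` style output whose
    command basename equals `name` (hypothetical helper)."""
    pids = []
    for line in ps_output.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        pid, command = parts
        if pid.isdigit() and os.path.basename(command.strip()) == name:
            pids.append(int(pid))
    return pids


def hup_afp():
    """Send SIGHUP to every matching process (assumes root privileges)."""
    # One -o per column, each with an empty header, so no header row is printed.
    output = subprocess.check_output(
        ["ps", "-ax", "-o", "pid=", "-o", "comm="]
    ).decode()
    for pid in find_pids(output, PROCESS_NAME):
        os.kill(pid, signal.SIGHUP)
```

Calling hup_afp() on a schedule (from cron or a launchd job, say) is what "periodically HUP’ing" amounts to. The PID lookup is kept separate from the signalling so it can be checked against sample ps output without touching a live server.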
One fix that appeared promising, and which we tried recently, is not running Open Directory (of the network variety) on the same host as AFP. Fortunately we had a second Xserve which was acting as an OD replica and not much else, so we demoted it to a server which is just connected to OD, and moved our AFP home share there. This seemed to work fine for at least a day, but then this weekend the DirectoryService process crashed yet again, causing the same problem as before.
The thing that really blows my mind about this whole issue is that people have been reporting it since November of last year. That’s 5 whole months, and still no sign of a fix from Apple! Say what you will about other companies being slow to respond to problems, but I’ve never seen anyone else take this long to fix a major issue like this.
With OS X 10.5.3 being seeded to developers in the last few days, I hope that Apple finally gets on the ball and fixes this glaring problem! This is definitely one of the most frustrating problems I’ve encountered during my time in the computing industry.