A wooden toothpick is probably a bit too thick. You’d want something thin enough that it can be inserted without touching the electrical contacts. Something plastic is probably better if you have it, but if you do the cleaning while the device is off, the USB port should be unpowered and there shouldn’t be a risk of causing a short. Modern USB ports are also quite well protected against shorts anyway, so it’s very unlikely that you’d cause damage just by using something conductive. You mainly want something that is long and thin enough to get all the way to the bottom of the port without having to apply any force. If the only things you have that fit that description are made of metal, then that’s still a safer option than jamming something too thick into the port, which can deform the center contacts.
x86 has bit manipulation instructions that work on any bit. If you have a bool stored in bit 5, the CPU doesn’t need to do any separate masking; it can just directly check the state of bit 5. If you do masking in a low-level programming language to access individual bits, then compiler optimization will almost always turn it into the corresponding bit manipulation instructions.
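To make that concrete, here’s a rough sketch in C of what the masking looks like at the source level. The names (FLAG_BIT, flag_is_set and so on) are made up for illustration, and which exact instruction the compiler ends up emitting depends on the compiler and the optimization flags, so take it as a sketch rather than a guarantee.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: the bool we care about lives in bit 5 of a flags byte. */
#define FLAG_BIT 5u

/* Read the flag: shift bit 5 down to bit 0 and mask off everything else. */
static bool flag_is_set(uint8_t flags) {
    return ((flags >> FLAG_BIT) & 1u) != 0u;
}

/* Return a copy of flags with bit 5 set; all other bits are untouched. */
static uint8_t flag_set(uint8_t flags) {
    return (uint8_t)(flags | (1u << FLAG_BIT));
}

/* Return a copy of flags with bit 5 cleared; all other bits are untouched. */
static uint8_t flag_clear(uint8_t flags) {
    return (uint8_t)(flags & ~(1u << FLAG_BIT));
}

/* Small usage example: set the flag, check it, then clear it again. */
static void flag_demo(void) {
    uint8_t flags = 0x03;          /* bits 0 and 1 already in use */
    flags = flag_set(flags);       /* now 0x23 */
    assert(flag_is_set(flags));
    flags = flag_clear(flags);     /* back to 0x03 */
    assert(!flag_is_set(flags));
}
```

If you want to see what your own compiler does with it, compile with optimizations on and look at the assembly output.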
So there’s not even a performance impact if you’re cycle limited. If you have to operate on a large number of bools, then packing 8 of them into each byte can sometimes actually improve performance, as you can then use the cache more efficiently. Though unless you’re working with thousands of bools in a fast-running loop, you’re not likely to really notice the difference.
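As a sketch of what packing looks like, here’s a minimal C version that stores 8 bools per byte. The type and function names (PackedBools, packed_get, etc.) and the count of 4096 are just assumptions for the example. With this layout 4096 bools take 512 bytes, where a plain array of one-byte bools would take 4096.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical number of bools to store. */
#define NUM_FLAGS 4096u

/* 8 bools per byte, rounded up to a whole number of bytes. */
typedef struct {
    uint8_t bytes[(NUM_FLAGS + 7u) / 8u];
} PackedBools;

/* Read bool number i: pick the byte (i / 8), then the bit within it (i % 8). */
static bool packed_get(const PackedBools *p, size_t i) {
    return ((p->bytes[i / 8u] >> (i % 8u)) & 1u) != 0u;
}

/* Write bool number i without disturbing the other 7 bits in that byte. */
static void packed_set(PackedBools *p, size_t i, bool value) {
    uint8_t mask = (uint8_t)(1u << (i % 8u));
    if (value) {
        p->bytes[i / 8u] |= mask;
    } else {
        p->bytes[i / 8u] &= (uint8_t)~mask;
    }
}

/* Count how many bools are true by walking every index. */
static size_t packed_count(const PackedBools *p) {
    size_t count = 0;
    for (size_t i = 0; i < NUM_FLAGS; i++) {
        if (packed_get(p, i)) {
            count++;
        }
    }
    return count;
}

/* Small usage example: set two bools and count them. */
static void packed_demo(void) {
    PackedBools p = {0};
    packed_set(&p, 5, true);
    packed_set(&p, 100, true);
    assert(packed_count(&p) == 2);
}
```

Whether this is actually faster than a plain bool array depends on the access pattern and the size of the data, so measure before committing to it.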
But most bool implementations still end up wasting 7 out of 8 bits (or sometimes even 15 out of 16, or 31 out of 32, to align to the word size of the device), simply because that generally produces the most readable code. Programming languages are designed not only for computers, but also for the humans who work on and maintain the code, and wasting bits in a bool happens to be the better trade-off for keeping code readable and maintainable.