Archives for category: Uncategorized

Burning Man is trying to get a ten-year agreement to continue holding the event in the Black Rock Desert. The Bureau of Land Management (BLM) has done a farcical Draft Environmental Impact Statement. Burning Man has analyzed the DEIS, and is calling for Burners to write comments. Cory Doctorow has written a terrific model letter.

I’m not going to repeat all the nonsense, but I’m pasting my letter below. Sheesh. C’mon folks, buckle down and write a letter. This is some serious bullshit.

———- Forwarded message ———
From: Me
Date: Sun, Apr 7, 2019 at 5:32 PM
Subject: environmental impact statement for Burning Man
To: <blm_nv_burningmaneis@blm.gov>
Cc: <eis@burningman.org>



Dear BLM, 

I have read an overview of the Burning Man DEIS with particular attention to the recommended mitigations. I have three comments and associated questions based on my ten years attending Burning Man. 


First, the call to increase security by requiring Burning Man to contract a private security service (PHS-1) is unnecessary and unjustified by the evidence presented. The “screening” foreseen in this mitigation is a search that is patently unconstitutional under the Fourth Amendment. The National Environmental Policy Act cannot be lawfully stretched to cover surveillance of this sort. It is grossly improper for BLM to request it.


The additional time required for the proposed screening of participants at the gate would be burdensome to the public and harmful to the environment. To be clear, Black Rock City is already pretty safe: in ten years I’ve had no issues whatsoever with firearms or anything else that required law enforcement. Indeed, the most serious safety risks I see each year are inevitably BLM Rangers and other law enforcement officials who are driving too fast.


Nearly every year I have to file notices with the Burning Man Rangers about unsafe driving by law enforcement; in 2018, I filed three notices. This happens both outside the city and inside the city, when law enforcement speeds by pedestrians and bicyclists. The consistently irresponsible driving of BLM Rangers and Nevada county sheriffs also affects mitigation AQ-1. Just a few hours of observation out there makes clear that law enforcement vehicles speeding at well over 30 mph through unfenced areas are among the biggest creators of dust problems.


What measures have you put in place to oversee the BLM Rangers to assure that they respect the safety and constitutional rights of the participants? I’m not asking that BLM Rangers “protect” participants’ safety. To the contrary, and to be clear, my question is: what is BLM doing to make sure that BLM Rangers don’t imperil people by unsafe driving or by unnecessary and violent policing? What training do the Rangers receive in protecting participants’ rights?


Second, the jersey barriers (PHS-3) are unnecessary. There are very few people who attempt to gain unauthorized entry into the event, and they are swiftly caught by Burning Man’s Gate, Perimeter, and Exodus staff. Have you considered how much energy would be required to manufacture and transport *nine miles* of jersey barriers? That’s about 19 million pounds of concrete and steel, perhaps one thousand 100+ mile round trips by flatbed semi-trailer from Sparks or Reno. Have you done the environmental impact analysis on this “mitigation”?


Third, the additional fluids (WHS-4) and wastewater (WHS-6) requirements are similarly unnecessary. I’ve built a number of state-of-the-art “evapotron” towers that eliminate about 200 gallons of greywater per week per tower, without leaking and without waste on the playa. Our evapotrons are regularly admired by the Earth Guardians. Consequently I’ve spent a lot of time over the years helping Burners capture, evaporate, and transport their greywater, and they’re pretty good at it. As a Burning Man greywater guru, I believe that your analysis is substantially in error.


To close, this Draft EIS seems to me a trumped-up list of invented problems. Burning Man has shown itself to be an extraordinarily good steward of public lands, bringing tens of thousands of people to a remote location, year after year, with an admirable health and safety record, while leaving no trace. The Draft EIS ignores this history. 


I look forward to receiving your plans for BLM Ranger retraining, an impact assessment of manufacturing and transporting 19 million pounds of steel and concrete, and an evidence-based, statistically rigorous analysis of the DEIS’s wastewater claims. 


Sincerely, 
me

My big archiving project is slow. It’s hard to iterate on solutions when each meaningful test can take 10+ hours, and in some cases, multiple days.

The most important bottleneck in the process is reading and writing to disks: this task is I/O-bound. The table below shows the read and write speeds according to the Blackmagic speed test. All tests were run on macOS 10.14.

disk               read        write       type                interface
MBP2018-internal   2533 MB/s   2675 MB/s   SSD                 NVMe
archives-2019      82 MB/s     86 MB/s     5400 rpm HDD        USB-C
photos             80 MB/s     88 MB/s     SSD (Samsung T5)    USB-C
archives-2018      81 MB/s     86 MB/s     5400 rpm HDD        USB 3.1
MBP2013-internal   400 MB/s    459 MB/s    SSD                 SATA
GDrive ext         189 MB/s    187 MB/s    7200 rpm HDD        Thunderbolt 2 + RAID 1
GDrive ext         184 MB/s    173 MB/s    7200 rpm HDD        USB-C/3.0

The external speeds are consistent across both machines, and this test shows the very best case. In real-world copying, throughput falls dramatically, sometimes to less than 1 MB/s, which I attribute to a combination of lots of hardlinks (see below) and, in some directories, hundreds of thousands of tiny files. I’m working on those two problems, but even these raw, best-case speeds seem inadequate to me. I’m not sure why these disks are so slow.

Update: I think the limiting factor is input/output operations per second (IOPS, reported as tps in iostat). This Wikipedia article suggests that spinning disks (as opposed to SSDs) can sustain 100-200 IOPS. Finding the specific sector where a file lives costs one I/O operation, so this effectively limits a disk to reading 100-200 files per second, even if the files are very small. That is really slow when there are millions of files. SSDs are ridiculously better at this kind of task.
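A quick back-of-the-envelope calculation shows why this hurts. This is just arithmetic on the 100-200 IOPS figure above, with a hypothetical 8 million files thrown in for illustration:

import math  # not strictly needed; shown for completeness

# Rough estimate: how long the seeks alone take on a spinning disk,
# assuming one I/O operation per file and the IOPS range quoted above.
files = 8_000_000   # hypothetical file count, for illustration only
for iops in (100, 200):
    hours = files / iops / 3600
    print(f"at {iops} IOPS: about {hours:.0f} hours just to touch every file")

That works out to 11-22 hours of pure seeking before any real throughput, which is consistent with the 10+ hour runs mentioned above.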

That said, my 500GB Samsung T5 isn’t doing any better at the Blackmagic test, so I’m still a little vague about what’s going on.

In re hardlinks: when I’m copying, I’m using something like rsync -rtlOWSH, which copies hard-linked files as hard links on the destination filesystem. rsync has a tough time with hardlinks because it needs to keep a table of every inode it has seen during the run; as rsync eats up RAM, it slows down. I am writing a copy-by-inode tool to work around this problem.
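The copy-by-inode idea, roughly: group the source paths by inode, copy each inode’s data once, then re-create the remaining paths as hard links on the destination. A minimal in-memory sketch of that idea (not the actual tool, and all the names here are mine):

import os
import shutil
from collections import defaultdict

def copy_by_inode(src_root, dst_root):
    """Sketch: copy each inode's data once, then hard-link its other paths."""
    by_inode = defaultdict(list)
    for dirpath, _, filenames in os.walk(src_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            by_inode[os.lstat(path).st_ino].append(path)

    for paths in by_inode.values():
        first, *rest = paths
        dst_first = os.path.join(dst_root, os.path.relpath(first, src_root))
        os.makedirs(os.path.dirname(dst_first), exist_ok=True)
        shutil.copy2(first, dst_first)            # copy the data once
        for other in rest:
            dst_other = os.path.join(dst_root, os.path.relpath(other, src_root))
            os.makedirs(os.path.dirname(dst_other), exist_ok=True)
            os.link(dst_first, dst_other)         # re-create the hard link

This toy version still holds the whole inode table in memory, so the real tool presumably keeps that bookkeeping somewhere smarter.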

One of the steps in my massive file-archiving project requires that I save all the paths, with their associated inodes and sizes, from each filesystem I intend to integrate. I’ve decided to save the info to a sqlite database (the link is a great tutorial: if you already know how to use SQL but need the specific sqlite idioms, it’s the page you want).

The table below shows several approaches to getting the filesystem data into the database. I’ll list the winning command here, then explain the alternatives:

find "$SRC" -type f -printf "%p|%i|%s\n" |\
	pv --line-mode -bta |\
	sqlite3 -bail "$FSLOCATION" ".import /dev/stdin paths"

This is GNU find; I’m not sure whether the BSD find that ships with macOS has the same options. You can install GNU find with Homebrew, and this link shows how to use the default name (i.e., find rather than Homebrew’s gfind) to override the BSD find.

Anyway, find prints a pipe-delimited list of path, inode, and size to stdout; pv writes a nice progress message; and then sqlite imports directly from stdin. Note that you need to create the table (paths) before this step.
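Creating that table is a one-liner; here’s a sketch using Python’s sqlite3 module (the column names are my guess and just need to match the %p|%i|%s order):

import sqlite3

# Create the table that the find | pv | sqlite3 pipeline imports into.
con = sqlite3.connect("filesystem.sqlite")   # stands in for $FSLOCATION
con.execute("CREATE TABLE IF NOT EXISTS paths (path TEXT, inode INTEGER, size INTEGER)")
con.commit()
con.close()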

test                                speed   comment
base                                0.01s   just setup
find | sqlite                       0.17s   very simple
find -> tmp; sqlite import          0.26s   not as clean, but simple
find -> tmp; python+sqlite import   0.22s   python buffers better?
os.walk over dirs + sqlite import   0.35s   find is much faster

This table shows results on a test directory of about 21 GB containing about 30K files. Piping find straight into sqlite is considerably faster than the other options: redirecting find to a temporary file and then importing it is slower; having python read the temporary file and insert the values into sqlite is a little better (I think python parses the file faster than sqlite’s import does; I’m using pandas to parse the file); and using python’s os.walk instead of find is much slower.
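For what it’s worth, the python+pandas variant is only a few lines. A sketch of that approach (file names here are made up, and it assumes no path contains a pipe character):

import sqlite3
import pandas as pd

# Parse the pipe-delimited temp file that find wrote, then bulk-insert it.
df = pd.read_csv("/tmp/paths.txt", sep="|", names=["path", "inode", "size"])
con = sqlite3.connect("filesystem.sqlite")
df.to_sql("paths", con, if_exists="append", index=False)
con.close()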

My guess is that the find | sqlite option benefits from a bit of concurrency and smart buffering. The shell (zsh, in this case) is getting a chunk of data from find and passing it to sqlite, letting find run while sqlite does the import. On a much bigger directory, I can see both find and sqlite using CPU time. Eventually everything slows to the speed of the slowest process, but the buffer is big and both can happen mostly at once.

This is a big help for my coming tool, a mass file-copy script that doesn’t choke on tons of hardlinks (which cp and rsync most definitely do).

Among the files I want to organize in this giant archiving project are photos. These could be scanned images of old paper photos, jpgs from my phone or shared with me, or jpgs and raw files from a couple of decades of electronic photography.

The problem is that the files are scattered across backup systems that go back decades. To collect all the images, I wrote a little python script called getpix.py. (Note that the filename is a hyperlink to the GitHub gist, which I’m double-linking because WordPress doesn’t format a code literal+hyperlink in an intuitive way.)

Anyway: the script recursively descends a source directory and moves every image it finds to a destination directory in the format bydate/YYYY/MM/DD.

At every directory, the script runs exiv2 on every file (this could be improved by making the subprocess call to find smarter). Files that have an EXIF timestamp use it for the directory sorting. If not, and there’s a timestamp in the path (which Apple photo directories often keep), that timestamp is used. One could add a final fallback to the file’s ctime, but at least for me, the file metadata is so badly mangled that it provokes more confusion than enlightenment.

Files with no dates are sorted into no_date.
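The actual script is in the gist linked above; the fallback logic described here reads roughly like this sketch (the exiv2 flags, EXIF key, and helper names are my own guesses, not copied from getpix.py):

import os
import re
import shutil
import subprocess
from datetime import datetime

DATE_IN_PATH = re.compile(r"(\d{4})[/-](\d{2})[/-](\d{2})")   # e.g. .../2014/07/04/...

def image_date(path):
    """Best-effort date: EXIF timestamp via exiv2, else a date embedded in the path."""
    out = subprocess.run(
        ["exiv2", "-g", "Exif.Photo.DateTimeOriginal", "-Pv", path],
        capture_output=True, text=True).stdout.strip()
    if out:                                   # e.g. "2014:07:04 12:34:56"
        return datetime.strptime(out.split()[0], "%Y:%m:%d")
    m = DATE_IN_PATH.search(path)
    if m:                                     # Apple-style dated directories
        return datetime(*map(int, m.groups()))
    return None                               # no usable date

def sort_image(path, dest="bydate"):
    d = image_date(path)
    subdir = f"{dest}/{d:%Y/%m/%d}" if d else f"{dest}/no_date"
    os.makedirs(subdir, exist_ok=True)
    shutil.move(path, subdir)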

The resulting bydate structure can be dragged into the rest of the archiving process. There will be lots and lots and lots of duplicate images, and this is a gigantic PITA. There is a Right Way To Do It: use PhotoSweeper. This app will review all the images, link duplicates, and delete extras using configurable and intelligent defaults.

Note to self: do not try to do this inside Lightroom, what a mess that is.

I’m left with about 30K images, which Lightroom can handle without even turning on the CPU fan. This is a step forward.

I have a lot of data from a lot of years: about 5TB with around 8 million files. It’s very redundant, lots of copies of the same stuff. Many of the files are tiny, e.g., 100,000 1-2KB files in Maildir.

Most of the data are now on medium-sized external disks (2-8TB each) accessed via USB or Thunderbolt. It’s time to get everything onto a small set of usable disks (I’ve tried this before and I didn’t get very far).

One of the things that slows me down is that no matter how I set up the copy (cp, rsync, Finder), after a few minutes the copy slows to a crawl. These are reasonably fast disks on USB 3.0 or Thunderbolt 2. The r/w speed of the disks should be around 150 MB/s, and the connection is 5 Gb/s, but I’d often see read speeds around 0.5 MB/s. Ooof. You’re not going to move a terabyte at that speed.
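To put numbers on that last sentence, plain arithmetic on the figures above:

# Time to move 1 TB at the nominal disk speed vs. the observed crawl.
tb_in_mb = 1_000_000
for mb_per_s, label in [(150, "nominal disk speed"), (0.5, "observed crawl")]:
    hours = tb_in_mb / mb_per_s / 3600
    print(f"{label}: {hours:,.1f} hours ({hours / 24:.1f} days)")

About two hours at the nominal speed, versus something like three weeks at the crawl.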

And now I think I know what’s happening: the directories get disorganized. It frustrates me that I can’t figure out exactly what that means, but I discovered that after running DiskWarrior on the offending drive, it copies at 50-100 MB/s (I’m using iostat to watch the r/w speeds). A big, big win.

Of course, APFS makes this useful knowledge nearly obsolete. Ah, the story of my life, learning useful stuff just as it becomes a kind of vintage affectation.

Gotta go, I’m going to write some shell scripts to make my terminal prompt look cool.

You thought I was kidding?

Every year at Burning Man, we build a Temple of memories of what we’ve lost during the previous year. The Temple itself is a fabulous structure of wood that lifts our thoughts and provides a place for our physical tokens of loss. We place essays, epigrams, photographs, and objects. We come to remember, to mourn, and to celebrate. And then, in a moment of collective solemnity unlike any other during our wild week, we burn it.

[photo: FullSizeRender-12]

I added the essay linked below, printed 11×17″, mounted on posterboard, and hinged like a book. You can download the full text as a pdf. I’ve already left the playa, and I’m posting it here more or less as the Temple is scheduled to burn.

suicide-and-community

Patrick Ball, 3 September 2017.

The animatronic tail I’m making for ❤ has a pretty simple UI: there’s one button. If you push the button, the tail wags. But if you double-click the button, the tail goes nuts.

Alas, buttons are noisy. As the button is pushed or released, there are many tiny voltage surges and drops as microscopic ridges and grooves in the switch contacts touch and separate before the connection is definitively made or broken. The Arduino sees these little surges and drops and thinks that the button is being pushed and released a few times within 10-20 milliseconds. This is called bouncing, and it is a Bad Thing.

There are tons of approaches for debouncing buttons in software. These all basically come down to ignoring the button for a while after detecting a change. That means that any additional real button presses also get ignored. It’s pretty hard to get the software tuned to ignore bouncing but detect double-clicking. People claim to have done it, but it didn’t work for me.
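For the record, the basic “ignore it for a while” approach is only a few lines. Here’s a sketch of the logic in Python rather than Arduino C, just to show the idea (the names and the 20 ms window are mine):

import time

DEBOUNCE_S = 0.02   # ignore further edges for 20 ms after any change

class DebouncedButton:
    def __init__(self, read_pin):
        self.read_pin = read_pin              # callable returning the raw pin state
        self.state = read_pin()
        self.last_change = time.monotonic()

    def pressed(self):
        """Return True once per debounced press edge."""
        now = time.monotonic()
        raw = self.read_pin()
        if raw != self.state and (now - self.last_change) > DEBOUNCE_S:
            self.state = raw
            self.last_change = now
            return raw                        # True only on the press edge
        return False

Anything that happens inside that lockout window, including a fast second click, is thrown away, which is exactly why tuning this to also catch double-clicks is so fiddly.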

Debouncing in hardware is a little more involved: it requires building a small circuit with a resistor, a capacitor, and an op-amp integrated circuit. The circuit is called a Schmitt trigger, and the point is to sharpen a noisy set of voltage shifts into a nice square wave: either the button is pushed or it is not, with no bouncy jiggling.

There are several ways to build a Schmitt trigger, and my hack is shown in the photo. I used a 10 kΩ resistor, a 10 µF capacitor, and an LMP358 op-amp chip.

[photo: IMG_2385]

I got the design from Jeremy Blum’s excellent tutorial on this topic. I’ve tested it, and it works incredibly well. Very smooth, no extraneous clicks, and the software no longer has to ignore the button with delay-based workarounds. Yay!

 

Once upon a time, someone said of Unix regular expressions that if you think you can use them to solve a problem, now you have two problems. What he meant is that regexes are at once so powerful and so complicated that they’re hard to debug, and it’s hard to be sure you know what they’re really doing. I feel that way about dremels.

[photo: IMG_2379]

I use the dremel to cut stuff, like plastic boxes, and in this case, a seriously overspec’d steel motor mount I’m using to fix a stepper motor in a little box to create an animatronic tail for my ❤.

But whenever I get the dremel going, I feel like it’s within an instant of flying to pieces, maiming me, and, more importantly, damaging whatever I’m cutting. A safety course might be helpful, someday.

I’ve been pretty curious how much energy the mirror uses. However, it’s taken me a long time to figure out how to measure it. For unrelated reasons, I’ve got an Arduino that I figured I could use for the voltage measurement.

[photo: IMG_2382]

The power to the Arduino comes straight from the batteries, so I put a little voltage sensor across the battery feed’s positive and negative. It sends an analog signal to the Arduino, and the Arduino parses the analog input and sends logging info to the Raspberry Pi via a serial connection (note that the Arduino is set to 3.3 V! otherwise the serial connection could damage the Pi).

There’s a daemon running on the Pi that parses the Arduino’s serial feed and writes tiny data files to a tmpfs dir. Each time that daemon gets a reading, it writes a file to the tmpfs with one line of data (actually, it writes a temp file and then renames it; renaming is atomic, which avoids a race condition between the serial parser and the data readers). Another daemon reads the data line and logs it. The two-part process means that one daemon can parse the serial data while several others consume it.
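The write-then-rename trick is the important part. A minimal sketch of the writer side (the directory and names here are mine, not the actual daemon):

import os
import tempfile

TMPFS_DIR = "/run/blinkybox"   # hypothetical tmpfs mount point

def publish_reading(name, line):
    """Write one line of data atomically so readers never see a partial file."""
    fd, tmp_path = tempfile.mkstemp(dir=TMPFS_DIR)   # temp file on the same tmpfs
    with os.fdopen(fd, "w") as f:
        f.write(line + "\n")
    os.rename(tmp_path, os.path.join(TMPFS_DIR, name))  # atomic within one filesystem

# e.g., after parsing a voltage reading off the serial feed:
# publish_reading("volts", "12.61")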

In this case, we want the log of the voltages. I set up the test with a single fully charged battery (a 12 V/100 Ah deep cycle). Here’s what I found:

[chart: blinkybox-volts-by-hours]

The mirror ran for over a day! That means that two batteries, charged by solar during the daytime, are much more capacity than we need to run it for 10 hours at night. Good.
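The rough arithmetic behind that conclusion, ignoring depth-of-discharge and conversion losses:

# One 12 V / 100 Ah battery ran the mirror for at least 24 hours.
battery_wh = 12 * 100                # 1200 Wh per battery
max_avg_draw_w = battery_wh / 24     # so the mirror averages at most ~50 W
night_wh = max_avg_draw_w * 10       # <= ~500 Wh needed for a 10-hour night
print(max_avg_draw_w, night_wh, 2 * battery_wh)   # 50.0 500.0 2400

Two batteries hold roughly 2400 Wh against at most ~500 Wh per night, which is a comfortable margin.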

 

One of my favorite project batteries is this 7.4v/4.4Ah Li-ion from BatterySpace. It’s basically four 3.7v batteries (approximately AA sized) packed in heavy shrink wrap.

[photo: IMG_2380]

The original is the white and green on the left. I needed it to fit the form on the right so it will sit nicely in the project box for the animatronic tail. So I cut it open, and found a little more complexity than I expected.

The batteries are in two packs of two each, wired in parallel; the packs are then wired in series (each parallel pair gives 3.7 V at 4.4 Ah, and the two pairs in series give the pack’s 7.4 V at 4.4 Ah). They connect to a little PCB that adds charging and short protection.

The trick is that the batteries have foil tape for the contacts, and I couldn’t figure out how to solder to the foil. I made 3 copper plates with solder blobs for contacts. All 4 batteries connect on one end (two negative and two positive) with a square copper plate. The solder blobs on the plate press into the battery contacts, and I taped it tightly. I soldered a wire to the plate which connects to the common pad on the PCB.

On the positive and negative ends, I did something similar with two rectangular copper plates, one taped across the two positives, the other across the two negatives. Lots of tape prevents shorts, and now I have a battery that fits nicely in the box.