  • Geoff's site: Profiling Ag. Writing My Own Scandir
    …called scandir() on each directory. Then scandir() called filename_filter() on every entry in the directory. To figure out if a file should be ignored, filename_filter() called fnmatch() on every entry in the global char *ignore_patterns[]. This set-up had several problems. scandir() didn't let me pass any useful state to filename_filter(); the filter could only base its decision on the dirent and any globals. And ignore_patterns was just an array of strings, so it couldn't keep track of a hierarchy of ignore files in subdirectories. This made some ignore entries behave incorrectly (issue #43). This also hurt performance.

    Fixing these issues required rejiggering some things. First, I wrote my own scandir(). The most important difference is that my version lets you pass a pointer to the filter function. This pointer could be to, say, a struct containing a hierarchy of ignore patterns. Surprise surprise: the next thing I did was make a struct for ignore patterns.

        struct ignores {
            char **names;    /* Non-regex ignore lines. Sorted so we can binary search them */
            size_t names_len;
            char **regexes;  /* For patterns that need fnmatch */
            size_t regexes_len;
            struct ignores *parent;
        };

    This is sort of an unusual structure. Parents don't have pointers to their children, but they don't need to. I simply allocate the ignore struct, search the directory, then free the struct. (This is done around line 340 of search.c.) Searching is recursive, so children are freed before their parents. The final change was to rewrite filename_filter(). It calls fnmatch() on every entry in the ignore struct passed to it. If none of those match and ig->parent isn't NULL, it repeats the process with the parent ignore struct, and so on until it reaches the top. All in all, not…
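
    To make the mechanism concrete, here is a minimal sketch of a filter that walks such a hierarchy, along with the kind of scandir() replacement the post describes. This is illustrative, not Ag's actual code: my_scandir, cmp_str, and the exact filename_filter signature are hypothetical; only struct ignores and the fnmatch()-then-parent logic come from the post.

        #include <dirent.h>
        #include <fnmatch.h>
        #include <stdlib.h>
        #include <string.h>

        struct ignores {            /* as described in the post */
            char **names;           /* sorted, so bsearch() works */
            size_t names_len;
            char **regexes;
            size_t regexes_len;
            struct ignores *parent;
        };

        /* Compare two strings given pointers to array elements of type char *. */
        static int cmp_str(const void *a, const void *b) {
            return strcmp(*(const char *const *)a, *(const char *const *)b);
        }

        /* Return 0 to ignore an entry, 1 to keep it. Check this directory's
         * patterns first, then walk up through the parent ignore structs. */
        static int filename_filter(const struct dirent *entry, void *baton) {
            const char *name = entry->d_name;
            const struct ignores *ig;
            size_t i;

            for (ig = baton; ig != NULL; ig = ig->parent) {
                if (ig->names_len > 0 &&
                    bsearch(&name, ig->names, ig->names_len, sizeof(char *), cmp_str))
                    return 0; /* exact match against a sorted ignore name */
                for (i = 0; i < ig->regexes_len; i++)
                    if (fnmatch(ig->regexes[i], name, 0) == 0)
                        return 0; /* matched a glob-style ignore pattern */
            }
            return 1;
        }

        /* The key difference from scandir(3): a baton pointer is forwarded to
         * the filter, so the filter can consult state like the ignore hierarchy. */
        int my_scandir(const char *dirname, struct dirent ***namelist,
                       int (*filter)(const struct dirent *, void *), void *baton);

    The baton is exactly the state-passing that stock scandir() lacks: the caller hands the scandir replacement a pointer to the current directory's struct ignores, and the filter can see the whole chain.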

    Original URL path: http://geoff.greer.fm/2012/09/03/profiling-ag-writing-my-own-scandir/ (2016-02-15)


  • Geoff's site: S3 Logging and Analytics
    …S3 Management Console. Take a look at a bucket's properties and click on the logging tab. (I just saved you half an hour of messing with s3curl.pl and XML ACLs. You're welcome.) Enable logging and choose a bucket to save logs to. I suggest you set a target prefix with a slash in the name. If you don't, there will be no easy way to select all logs in the AWS management console, making batch changes much harder. Hooray, you've got S3 saving logs in a logging bucket. Now the trick is to download and parse them. With the help of libcloud, I wrote a little Python script to download all logs:

        #!/usr/bin/env python
        import hashlib

        from libcloud.storage.types import Provider
        from libcloud.storage.providers import get_driver

        key = 'access key id'
        secret = 'secret access key'
        container_name = 'blah-logs'
        path = 'logs/access_log'
        delete_files = False

        storage_driver = get_driver(Provider.S3)
        provider = storage_driver(key, secret)
        container = provider.get_container(container_name)
        container_objects = provider.list_container_objects(container)

        def md5(path):
            f = open(path)
            data = True
            md5sum = hashlib.md5()
            while data:
                data = f.read(2**20)
                md5sum.update(data)
            f.close()
            return md5sum.hexdigest()

        for obj in container_objects:
            if obj.name[0:len(path)] == path:
                try:
                    file_md5 = md5(obj.name)
                except Exception as e:
                    file_md5 = 0
                # Only download objects whose local copy is missing or stale.
                if file_md5 != obj.hash:
                    obj.download(obj.name)
                if delete_files:
                    obj.delete()

    S3's log files are small but numerous. I didn't want to make things more complicated by using Twisted, so this script doesn't download them in parallel. The first run can take a while, so be patient. Add some prints if you want to see progress. The script only downloads files that aren't…

    Original URL path: http://geoff.greer.fm/2012/08/28/s3-logging-and-analytics/ (2016-02-15)

  • Geoff's site: The Silver Searcher: Benchmarking Revisions
    …ag --stats -i blahblahblah ~/code 2>&1 | grep seconds | tail -n 1 | awk '{print $1}')
            echo "$REV $TIME1 $TIME2 $TIME3"
        }

        # 6a38fb74 is the first rev that supports --stats
        REV_LIST=$(git rev-list 6a38fb74..master)
        for rev in $REV_LIST; do
            benchmark_rev $rev
        done

    This script runs three benchmarks on each revision: case-sensitive string matching, regular expression matching, and case-insensitive string matching. The results surprised me. Hover over the lines and annotations for more information about each revision. Zero values are due to incorrect behavior or failed builds. For personal projects like Ag, I don't spend much effort making sure master is always deployable. Tagged releases are another matter, of course. Graphing the performance over time makes regressions obvious. One change made the benchmarks double in execution time, from 2 seconds to 4. For comparison, grep -r takes 11 seconds and spits out tons of useless matches. Ack takes 20 seconds.

    The first thing that caught my eye was the spike labelled B. I found that all my hard work improving performance was negated by a single commit: 13f1ab69. This commit called fnmatch() twice as much as previous versions. Over 50% of execution time was already spent in fnmatch(), so it really hurt performance. The drop at D is from me backing out the change until I can write something that doesn't slow things down. Looking at other specific changes, I can also see that 43886f9b (annotation C) improved string matching performance by 30%. This was not intended: I was cleaning up some code and fixed an off-by-one error that slightly impacted performance. It certainly wasn't going to cause a 30% difference. After git blame-ing, I found the commit that introduced the problem: 01ce38f7 (annotation A). This was quite a stealthy performance…

    Original URL path: http://geoff.greer.fm/2012/08/25/the-silver-searcher-benchmarking-revisions/ (2016-02-15)

  • Geoff's site: Character encoding bugs are 𝒜wesome!
    …TO CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
        ERROR 1709 (HY000): Index column size too large. The maximum column size is 767 bytes.

    This is when I discovered that InnoDB limits index columns to 767 bytes. Why is this suddenly an issue? Because changing the charset also changes the number of bytes needed to store a given string. With MySQL's utf8 charset, each character can use up to 3 bytes. With utf8mb4, that goes up to 4 bytes. If you have an index on a 255-character column, that would be 765 bytes with utf8: just under the limit. Switching to utf8mb4 increases that index column to 1020 bytes (4 × 255). The solution: delete the old index, alter the table, then create a new index on only the first 191 characters of the column. How does one find the offending index? SHOW INDEXES FROM foo; will show all indexes on the table. Combine that with DESCRIBE foo; and you can figure out which indexes are on columns longer than 191 characters. With that out of the way, back to the action:

        mysql> DROP INDEX foo_1234 ON foo;
        Query OK, 0 rows affected (0.00 sec)
        Records: 0  Duplicates: 0  Warnings: 0

        mysql> ALTER TABLE foo CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
        Query OK, 0 rows affected (0.11 sec)
        Records: 0  Duplicates: 0  Warnings: 0

        mysql> CREATE INDEX foo_1234 ON foo (baz(191));

    You might say, "OK, now we're finished, right?" Ehh, it's not so simple. MySQL's default utf8 collation is case-insensitive; the utf8mb4_bin collation used here is case-sensitive. The implications are vast. This change in constraints forces you to sanitize the data currently in your database, then make sure you don't insert anything with the wrong casing. At work, there was…
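
    Why 191? It's just the 767-byte InnoDB limit divided by utf8mb4's 4 bytes per character, rounded down: 191 × 4 = 764 bytes fits under the limit, while 192 × 4 = 768 bytes does not.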

    Original URL path: http://geoff.greer.fm/2012/08/12/character-encoding-bugs-are-%F0%9D%92%9Cwesome/ (2016-02-15)

  • Geoff's site: Open-source Your Abandonware
    …lose interest. Maybe they're working on a rewrite. Maybe their priorities changed. Meanwhile, the editor languishes. Bugs don't get fixed. Promised features never show up. The end result is a community using a piece of abandonware as their main editor. If they had the source code, the community could improve their editor. But they don't, so they can't. Slowly, the community shrinks. People get fed up with a bug or a lacking feature. They switch to another editor. They re-learn keyboard shortcuts. They spend hours tweaking and customizing. They add and modify plugins. It's likely months before they're as productive as they used to be. This happens to thousands of users. Thousands of users experiencing countless hours of frustration. And it could be avoided if authors open-sourced these abandoned projects. If you are the author of a closed-source editor, please pledge to release your source code when you stop releasing updates. If you use a closed-source editor, contact the authors. Ask them to pledge to release their code. Martin Hedenfalk open-sourced his editor, Vico. I hope others follow suit. I doubt he'll do it, but I'd love…

    Original URL path: http://geoff.greer.fm/2012/07/09/open-source-your-abandonware/ (2016-02-15)

  • Geoff's site: Jekyll Gallery Plugin
    …in years. Most of my recent work has been C, Python, and Node.js. (I've made my views on Node quite clear.) Python is what I typically use for getting things done, and C is my weapon of choice when speed is essential. Ruby is similar to Python, but it has some nice syntactic sugar. There are annoyances, of course, but Ruby's surprises have been mostly pleasant. I…

    Original URL path: http://geoff.greer.fm/2012/07/02/jekyll-gallery-plugin/ (2016-02-15)

  • Geoff's site: Linksplosion
    …knew this feature already existed in other extensions, but I wanted something less bloated. Also, I wanted to get more experience writing Chrome extensions. I succeeded on both fronts. I learned useful things such as Chrome's context menu API, and discovered the particularly handy Element.querySelectorAll() [2]. Originally, I had a recursive function that walked the DOM looking for anchor tags. I knew it was bad, but couldn't…

    Original URL path: http://geoff.greer.fm/2012/06/14/linksplosion/ (2016-02-15)

  • Geoff's site: Node.js: Dealing with Errors
    …going to crash if any callback in a chain throws an error. To prevent this, you have to get used to wrapping tons of stuff in try/catch, or engaging in ludicrously defensive programming. Before some self-proclaimed Node expert tells me to stop throwing errors in callbacks, I'll point out that the Node.js API docs contain plenty of examples. Basic stuff like reading files uses this pattern. More importantly, it's easy to unintentionally throw an error. Accessing a property of an undefined variable is a common mistake, and it will throw a ReferenceError.

    Error isolation

    I know people have said this before, but it's insane that Node runs its own web server. By default, a single unhandled error will cause your web server to crash. Here's an example:

        var http = require('http');

        http.createServer(function(req, res) {
            var result = decodeURIComponent(req.url);
            // chop off the / at the beginning
            result = result.slice(1);
            result = result.toUpperCase(); // SHOUT
            res.writeHead(200, {'Content-Type': 'text/plain'});
            res.end(result + '\n');
        }).listen(5000, '127.0.0.1');

    This web server capitalizes whatever you send to it. Like so:

        ggreer@carbon:~% curl 127.0.0.1:5000/hello%20there
        HELLO THERE

    Neat, eh? Well, let's try throwing some more complicated data at it:

        ggreer@carbon:~% curl 127.0.0.1:5000/give%20110%
        curl: (52) Empty reply from server

    What's this? Going back to the other terminal, I see that Node.js has crashed:

        ggreer@carbon:~% node error_example.js

        /Users/ggreer/error_example.js:4
            var result = decodeURIComponent(req.url);
        URIError: URI malformed
            at Server.<anonymous> (/Users/ggreer/error_example.js:4:16)
            at Server.emit (events.js:70:17)
            at HTTPParser.onIncoming (http.js:1514:12)
            at HTTPParser.onHeadersComplete (http.js:102:31)
            at Socket.ondata (http.js…

    Original URL path: http://geoff.greer.fm/2012/06/10/nodejs-dealing-with-errors/ (2016-02-15)