Thursday, October 21, 2010

sorting clojure maps by value

So, here's the setup. I needed to sort a Clojure map by the values stored in it, not by key the way a sorted-map does (and since the values aren't unique, I can't just flip the map around and use a sorted-map keyed on the values). Basically I have a whole bunch of text and I want to keep a count of the number of occurrences of each word. I'll probably post the tokenization code a little later, but if anyone is interested it lives here. So we have a big soup of unique keys, each with some value associated with it. Awesome. Let's sort it.

(into (sorted-map-by 
  (fn [key1 key2] (compare (super-map key2) (super-map key1)))) 
super-map)
*btw, the above code can be found verbatim here.

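So that works, right? Sure, unless duplicate values reside in the map, in which case you'll be short some entries: the comparator returns 0 for any two keys that share a value, and a sorted map treats keys that compare as equal as the same key. So here's my seat-of-the-pants solution, slightly tested as it is.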
(defn omg-sort
  "Create a map sorted in descending order, first by value, then by key."
  [super-map]
  (into (sorted-map-by
          (fn [key1 key2]
            (let [val1 (super-map key1)
                  val2 (super-map key2)]
              (cond
                ;; same value: fall back to the keys, so two distinct
                ;; keys never compare as equal
                (= val1 val2) (compare key2 key1)
                ;; larger values sort first
                (< val1 val2) 1
                :else -1))))
        super-map))

This works by sorting by value, and then by key if the values are the same. As always, I welcome any suggestions, refinements, improvements.
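
To see the difference between the two comparators, here's a quick sketch on a small made-up word-count map. The names word-counts, naive-sort and tie-break-sort are just mine for illustration, and tie-break-sort is a variation rather than omg-sort itself: it compares [value key] vectors instead of using cond, which should give the same ordering under the assumption that the values are numbers and the keys are strings.

```clojure
;; Hypothetical word counts; "cat" and "dog" tie on purpose.
(def word-counts {"the" 5, "cat" 2, "dog" 2, "emu" 1})

(defn naive-sort
  "Value-only comparator, like the first snippet above."
  [m]
  (into (sorted-map-by (fn [k1 k2] (compare (m k2) (m k1)))) m))

(defn tie-break-sort
  "Descending by value, then descending by key."
  [m]
  (into (sorted-map-by
          (fn [k1 k2]
            ;; vectors of equal length compare element by element,
            ;; so the key only matters when the values are equal
            (compare [(m k2) k2] [(m k1) k1])))
        m))

(count (naive-sort word-counts))     ;=> 3, one of the tied entries is gone
(count (tie-break-sort word-counts)) ;=> 4
(keys (tie-break-sort word-counts))  ;=> ("the" "dog" "cat" "emu")
```

One thing to keep in mind with either version: the comparator looks values up in the original map, so asking the sorted map about a key that was never in it compares against nil. compare tolerates nil, while < (as used in omg-sort) would throw.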
