[Soot-list] Multiple instances of Soot in a single host program
Wei Peng
pengw at umail.iu.edu
Sun Dec 28 12:18:38 EST 2014
Hi All,
This is my experience report; hope it will be helpful to others with
similar needs.
After some code digging and much trial and error, I have come to the
conclusion that invocation of Soot has to be *fully* serialized, and that
Steven's suggested workaround (using the G.GlobalObjectGetter inner
interface) is the only reliable way to have multiple threads interact with
Soot *concurrently*, though unfortunately *not* in parallel. My attempt to
shrink Soot's critical section to just the uses of global objects (i.e., the
*/v() calls) failed with numerous exceptions.
The reason appears to be the lazy, on-demand resolution of many non-global
objects (e.g., SootClass, SootMethod), which boosts performance but
unfortunately spreads the singleton */v() invocations throughout Soot's
lifecycle.
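To make the "concurrent but not parallel" pattern concrete, here is a minimal, Soot-free Java sketch. Everything in it is a hypothetical stand-in: Context plays the role of the per-thread G, v() plays the role of a singleton accessor, and withContext plays the role of a global mutex plus context switch. It only illustrates the locking discipline and does not call any Soot API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Supplier;

// Hypothetical stand-in for a library whose state sits behind a global accessor.
final class SerializedContextDemo {
    /** Stand-in for one analysis's global state (not a Soot class). */
    static final class Context {
        final List<String> loaded = new ArrayList<>();
    }

    private static final Object LOCK = new Object();
    private static Context current = new Context();

    /** Stand-in for a singleton accessor: returns whichever context is installed. */
    static Context v() {
        return current;
    }

    /** Install ctx and run body while holding the one global lock. */
    static <T> T withContext(Context ctx, Supplier<T> body) {
        synchronized (LOCK) {
            current = ctx;
            return body.get();
        }
    }

    /** Start n workers, each with its own context; bodies run one at a time. */
    static List<List<String>> runWorkers(int n) {
        ExecutorService pool = Executors.newFixedThreadPool(n);
        try {
            List<Future<List<String>>> futures = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                final String name = "app-" + i;
                final Context ctx = new Context();
                Callable<List<String>> task = () -> withContext(ctx, () -> {
                    // Every access goes through the global accessor, as library code would.
                    v().loaded.add(name + "/A");
                    v().loaded.add(name + "/B");
                    return new ArrayList<String>(v().loaded);
                });
                futures.add(pool.submit(task));
            }
            List<List<String>> results = new ArrayList<>();
            for (Future<List<String>> f : futures) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runWorkers(4));
    }
}
```

Because the whole body runs under the lock, each worker sees only the entries it added itself, but no two bodies ever overlap in time. Shrinking the locked region to individual v() calls would let another thread swap the installed context between two accesses, which is the kind of interference described above.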
Below is a (Clojure) template in which multiple worker threads, each holding
its own G.GlobalObjectGetter instance, interact with Soot concurrently but in
a serialized fashion. I have polished the logic a bit to include a macro
("with-soot" in "example/helper.clj") and other helper routines that simplify
handling of the Soot mutex and G.GlobalObjectGetter. I plan to release the
code on GitHub later.
Wei.
==== example/worker-thread.clj ====
(ns example.worker-thread
  (:require [example.helper :as helper])
  (:import (soot.options Options)))

(defn worker-thread []
  (let [g-objgetter (helper/new-g-objgetter)]
    ;; this macro, defined in the helper namespace, wraps the Soot-related body
    (helper/with-soot
      ;; use the current thread's Soot context
      g-objgetter
      ;; no need to reset the Soot context at the end, as long as every
      ;; thread uses its own context
      false
      ;; the real work begins here
      ;; set soot.options.Options (the local "options" is bound by
      ;; helper/with-soot)
      (doto options
        (.set_src_prec (Options/src_prec_apk))
        (.set_whole_program true)
        ;; other options follow
        )
      ;; set soot.PhaseOptions (the local "phase-options" is bound by
      ;; helper/with-soot)
      (doto phase-options
        (.setPhaseOption "cg.cha" "enabled:true,verbose:true")
        ;; other phase options follow
        )
      ;; run body packs (only "jb" is listed here)
      (helper/run-body-packs :scene scene :pack-manager pack-manager
                             :body-packs ["jb"])
      ;; post-body-pack processing goes here; the local "scene" is bound to
      ;; the Scene singleton, i.e., Scene.v()
      )))
==== example/helper.clj ====
(ns example.helper
  (:import (soot G
                 G$GlobalObjectGetter
                 PhaseOptions
                 PackManager
                 Scene
                 SootClass) ; needed by the ^SootClass type hints below
           (soot.options Options)))

(def soot-mutex
  "Soot mutex: Soot is unfortunately a singleton"
  (Object.))

(defmacro with-soot
  "Wrap body with the major Soot refs, fetched *at call time*: g, scene,
  pack-manager, options, phase-options. If g-objgetter is a
  G$GlobalObjectGetter, it is installed first; pass nil to use whatever G is
  current *at call time*. Calls (G/reset) at the end if \"reset?\" is true."
  [g-objgetter reset? & body]
  `(locking soot-mutex
     (try
       (when (instance? G$GlobalObjectGetter ~g-objgetter)
         (G/setGlobalObjectGetter ~g-objgetter))
       (let [~'g (G/v)
             ~'scene (Scene/v)
             ~'pack-manager (PackManager/v)
             ~'options (Options/v)
             ~'phase-options (PhaseOptions/v)]
         ~@body)
       (finally
         ~(when reset?
            `(G/reset))))))
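
(defn new-g-objgetter
  "create a new Soot context (G$GlobalObjectGetter)"
  []
  ;; hold the G in an atom so that reset actually swaps in a fresh instance
  ;; (the original created a new G in reset but discarded it)
  (let [g (atom (new G))]
    (reify G$GlobalObjectGetter
      (getG [this] @g)
      (reset [this] (reset! g (new G))))))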

(defn get-application-classes
  "get application classes in scene"
  [scene]
  (->> (.. scene getApplicationClasses snapshotIterator)
       iterator-seq
       set))

(defn get-application-methods
  "get application methods in scene"
  [scene]
  (->> scene
       get-application-classes
       (remove #(.. ^SootClass % isPhantom))
       (mapcat #(.. ^SootClass % getMethods))
       set))

(defn map-class-bodies
  "map classes to their method bodies"
  [classes]
  (for [class classes
        :when (not (.. class isPhantom))
        method (seq (.. class getMethods))
        :when (and (.. method isConcrete)
                   (not (.. method isPhantom)))]
    (.. method retrieveActiveBody)))

(defn run-body-packs
  "run body packs over application classes"
  [& {:keys [scene pack-manager body-packs]}]
  (doto scene
    (.loadNecessaryClasses))
  (doseq [pack body-packs]
    (when-let [pack (.. ^PackManager pack-manager (getPack ^String pack))]
      (doseq [body (->> scene get-application-classes map-class-bodies)]
        (.. pack (apply body))))))
Wei Peng
Department of Computer and Information Science
Indiana University-Purdue University, Indianapolis (IUPUI)
pengw at iupui.edu | www.cs.iupui.edu/~pengw/
>
> On Sat, Dec 27, 2014 at 11:45 AM, Steven Arzt <Steven.Arzt at cased.de>
> wrote:
>
>> Hi Wei,
>>
>>
>>
>> Soot was sadly never intended to support such use cases. We have issues
>> with those design decisions (that date back more than ten years) in our own
>> projects as well. The best workaround we have found so far is to
>> quick-switch between Soot instances. We create one copy of all singletons,
>> then reset Soot, load the next instance, save it again, etc. Whenever a
>> thread needs to access one of the singletons, we check which thread it is,
>> get the corresponding saved generic instances, and restore them. This means
>> that we cannot have real concurrency, but we can simulate the singletons
>> for every thread by copying back and forth. Support for this switching has
>> been added to the G class (look for “GlobalObjectGetter”).
>>
>>
>>
>> I know that this is not a real solution, but changing such low-level
>> design decisions is almost impossible in a project like Soot without
>> breaking almost all current applications that use Soot.
>>
>>
>>
>> Best regards,
>>
>> Steven
>>
>>
>>
>> *From:* soot-list-bounces at CS.McGill.CA [mailto:
>> soot-list-bounces at CS.McGill.CA] *On Behalf Of* Wei Peng
>> *Sent:* Friday, 26 December 2014 23:18
>> *To:* soot-list at CS.McGill.CA
>> *Subject:* [Soot-list] Multiple instances of Soot in a single host
>> program
>>
>>
>>
>> Hi,
>>
>>
>>
>> First, thanks for open sourcing Soot and keeping on making it better
>> after so many years.
>>
>>
>>
>> I have searched but not found any answer to my following question.
>>
>>
>>
>> It appears to me that Soot is designed as a Singleton, as evidenced by the
>> ubiquitous .v() calls for retrieving the root objects (Scene, Main, etc).
>> However, if I want to do *separate* analyses on *totally different sets
>> of classes* in a *single host* program, the analyses would interfere with
>> each other through the global state shared via the Singleton.
>>
>>
>>
>> My questions are:
>>
>> * How can I host multiple instances of Soot in a single program?
>>
>> * If it is not possible without making intrusive changes to Soot, can I
>> somehow reset the global state?
>>
>>
>>
>> What I have found that may be relevant:
>>
>> * G.v().reset()
>>
>>
>>
>> *Use case* I have a host program that uses Soot to construct
>> SootMethod/SootClass objects, from which the host program extracts info.
>> The host program takes a bunch of Android APK names from STDIN, and farms
>> out *isolated* analyses of the APKs to a pool of worker threads. One step
>> in these worker threads is to use Soot to construct the SootMethods of the
>> APK (using the "jb" pack). However, the Singleton architecture of Soot
>> would prevent me from doing this.
>>
>>
>>
>> *Motivation* All the benefits of using a single JVM to do multiple
>> tasks: saving JVM start time, using lightweight threads instead of OS
>> processes, cache warming, controlling parallelism through FixedThreadPool
>> instead of using the GNU Make "-j" trick, etc.
>>
>>
>>
>> Wei.
>>
>>
>>
>> Wei Peng
>> Department of Computer and Information Science
>> Indiana University-Purdue University, Indianapolis (IUPUI)
>> pengw at iupui.edu | www.cs.iupui.edu/~pengw/
>>
>
>