← Back to team overview

launchpad-reviewers team mailing list archive

[Merge] lp:~jelmer/launchpad/code-import-use-safe-open into lp:launchpad

 

Jelmer Vernooij has proposed merging lp:~jelmer/launchpad/code-import-use-safe-open into lp:launchpad.

Requested reviews:
  Launchpad code reviewers (launchpad-reviewers)

For more details, see:
https://code.launchpad.net/~jelmer/launchpad/code-import-use-safe-open/+merge/72787

At the moment all code import URLs are checked when the import is registered - if e.g. a user enters a file:// URL it is rejected. 

This mechanism is no longer sufficient once we start importing bzr branches with the code importers, as bzr branches can be stacked on other branches or be references to other branches, and these URLs have to be checked as well. If we don't check these URLs, users might try to stack on private Launchpad branches or branches on the local disk, such as /etc.

Git repositories can also support some form of stacking ("alternates"), which involves opening another repository (Dulwich will probably support opening these kinds of repositories in the future).

This branch adds a "opener policy" to all code import workers, which checks all URLs that are opened. By default the "CodeImportBranchOpenPolicy" is used, which behaves similar to the open policy used by the code puller for bzr branches at the moment.

The test suite overrides the opener policy to be AcceptAnythingPolicy, as it uses local Git, Mercurial and Subversion repositories for testing the import workers.
-- 
https://code.launchpad.net/~jelmer/launchpad/code-import-use-safe-open/+merge/72787
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~jelmer/launchpad/code-import-use-safe-open into lp:launchpad.
=== modified file 'lib/lp/code/enums.py'
--- lib/lp/code/enums.py	2011-06-28 21:13:33 +0000
+++ lib/lp/code/enums.py	2011-08-24 20:46:19 +0000
@@ -811,6 +811,13 @@
         Bazaar or the Bazaar foreign branch support.
         """)
 
+    FAILURE_FORBIDDEN = DBItem(240, """
+        Forbidden URL
+
+        The import failed because the URL of the branch that is imported
+        or the URL of one of the branches that it references is blacklisted.
+        """)
+
     RECLAIMED = DBItem(310, """
         Job reclaimed
 

=== modified file 'lib/lp/codehosting/codeimport/foreigntree.py'
--- lib/lp/codehosting/codeimport/foreigntree.py	2009-06-25 04:06:00 +0000
+++ lib/lp/codehosting/codeimport/foreigntree.py	2011-08-24 20:46:19 +0000
@@ -7,10 +7,11 @@
 __all__ = ['CVSWorkingTree', 'SubversionWorkingTree']
 
 import os
-import subprocess
 
 import CVS
-import pysvn
+import subvertpy
+import subvertpy.client
+import subvertpy.ra
 
 
 class CVSWorkingTree:
@@ -52,25 +53,23 @@
         self.remote_url = url
         self.local_path = path
 
+    def _get_client(self):
+        username_provider = subvertpy.ra.get_username_provider()
+        auth = subvertpy.ra.Auth([username_provider])
+        auth.set_parameter(subvertpy.AUTH_PARAM_DEFAULT_USERNAME, "lptest2")
+        return subvertpy.client.Client(auth=auth)
+
     def checkout(self):
-        # XXX: JonathanLange 2008-02-19: Can't do this with cscvs yet because
-        # its Repository class assumes that the repository lives on the local
-        # filesystem.
-        svn_client = pysvn.Client()
-        svn_client.checkout(
-            self.remote_url, self.local_path, ignore_externals=True)
+        client = self._get_client()
+        client.checkout(
+            self.remote_url, self.local_path, rev="HEAD",
+            ignore_externals=True)
 
     def commit(self):
-        client = pysvn.Client()
-        client.checkin(self.local_path, 'Log message', recurse=True)
+        client = self._get_client()
+        client.log_msg_func = lambda c: 'Log message'
+        client.commit([self.local_path], recurse=True)
 
     def update(self):
-        # XXX: David Allouche 2006-01-31 bug=82483: A bug in
-        # pysvn prevents us from ignoring svn:externals. We work
-        # around it by shelling out to svn. When cscvs no longer
-        # uses pysvn, we will use the cscvs API again.
-        arguments = ['svn', 'update', '--ignore-externals']
-        retcode = subprocess.call(
-            arguments, cwd=self.local_path, stdout=open('/dev/null', 'w'))
-        if retcode != 0:
-            raise RuntimeError("svn update failed with code %s" % retcode)
+        client = self._get_client()
+        client.update(self.local_path, "HEAD", True, True)

=== modified file 'lib/lp/codehosting/codeimport/tests/test_foreigntree.py'
--- lib/lp/codehosting/codeimport/tests/test_foreigntree.py	2011-08-12 11:37:08 +0000
+++ lib/lp/codehosting/codeimport/tests/test_foreigntree.py	2011-08-24 20:46:19 +0000
@@ -10,7 +10,10 @@
 
 from bzrlib.tests import TestCaseWithTransport
 import CVS
-import pysvn
+
+import subvertpy.client
+import subvertpy.ra
+import subvertpy.wc
 
 from canonical.testing.layers import BaseLayer
 from lp.codehosting.codeimport.foreigntree import (
@@ -33,11 +36,12 @@
         :param original_url: The URL of the Subversion branch.
         :param new_path: The path of the checkout.
         """
-        client = pysvn.Client()
-        [(path, local_info)] = client.info2(new_path, recurse=False)
-        [(path, remote_info)] = client.info2(original_url, recurse=False)
-        self.assertEqual(original_url, local_info['URL'])
-        self.assertEqual(remote_info['rev'].number, local_info['rev'].number)
+        working_copy = subvertpy.wc.WorkingCopy(None, new_path)
+        entry = working_copy.entry(new_path)
+
+        remote = subvertpy.ra.RemoteAccess(original_url)
+        self.assertEqual(original_url, entry.url)
+        self.assertEqual(entry.revision, remote.get_latest_revnum())
 
     def setUp(self):
         TestCaseWithTransport.setUp(self)
@@ -96,7 +100,7 @@
         tree2 = SubversionWorkingTree(self.svn_branch_url, 'tree2')
         tree2.checkout()
 
-        client = pysvn.Client()
+        client = subvertpy.client.Client()
         client.propset(
             'svn:externals', 'external http://foo.invalid/svn/something',
             tree.local_path)

=== modified file 'lib/lp/codehosting/codeimport/tests/test_worker.py'
--- lib/lp/codehosting/codeimport/tests/test_worker.py	2011-08-19 03:58:35 +0000
+++ lib/lp/codehosting/codeimport/tests/test_worker.py	2011-08-24 20:46:19 +0000
@@ -36,11 +36,18 @@
     tree as CVSTree,
     )
 from dulwich.repo import Repo as GitRepo
-import pysvn
+import subvertpy
+import subvertpy.client
+import subvertpy.ra
 
 from canonical.config import config
 from canonical.testing.layers import BaseLayer
 from lp.codehosting import load_optional_plugin
+from lp.codehosting.safe_open import (
+    AcceptAnythingPolicy,
+    BadUrl,
+    SafeBranchOpener,
+    )
 from lp.codehosting.codeimport.tarball import (
     create_tarball,
     extract_tarball,
@@ -54,6 +61,7 @@
 from lp.codehosting.codeimport.worker import (
     BazaarBranchStore,
     BzrSvnImportWorker,
+    CodeImportBranchOpenPolicy,
     CodeImportWorkerExitCode,
     CSCVSImportWorker,
     ForeignTreeStore,
@@ -107,6 +115,7 @@
     def setUp(self):
         TestCaseWithTransport.setUp(self)
         self.disable_directory_isolation()
+        SafeBranchOpener.install_hook()
 
     def assertDirectoryTreesEqual(self, directory1, directory2):
         """Assert that `directory1` has the same structure as `directory2`.
@@ -611,7 +620,8 @@
         """Make an ImportWorker."""
         return ImportWorker(
             self.source_details, self.get_transport('import_data'),
-            self.makeBazaarBranchStore(), logging.getLogger("silent"))
+            self.makeBazaarBranchStore(), logging.getLogger("silent"),
+            AcceptAnythingPolicy())
 
     def test_construct(self):
         # When we construct an ImportWorker, it has a CodeImportSourceDetails
@@ -647,7 +657,7 @@
         """Make a CSCVSImportWorker."""
         return CSCVSImportWorker(
             self.source_details, self.get_transport('import_data'), None,
-            logging.getLogger("silent"))
+            logging.getLogger("silent"), opener_policy=AcceptAnythingPolicy())
 
     def test_getForeignTree(self):
         # getForeignTree returns an object that represents the 'foreign'
@@ -676,7 +686,8 @@
         source_details = self.factory.makeCodeImportSourceDetails()
         return GitImportWorker(
             source_details, self.get_transport('import_data'),
-            self.makeBazaarBranchStore(), logging.getLogger("silent"))
+            self.makeBazaarBranchStore(), logging.getLogger("silent"),
+            opener_policy=AcceptAnythingPolicy())
 
     def test_pushBazaarBranch_saves_git_cache(self):
         # GitImportWorker.pushBazaarBranch saves a tarball of the git cache
@@ -764,7 +775,7 @@
             self.get_transport('bazaar_store'))
         self.foreign_commit_count = 0
 
-    def makeImportWorker(self, source_details):
+    def makeImportWorker(self, source_details, opener_policy):
         """Make a new `ImportWorker`.
 
         Override this in your subclass.
@@ -803,7 +814,8 @@
         # Running the worker on a branch that hasn't been imported yet imports
         # the branch.
         worker = self.makeImportWorker(self.makeSourceDetails(
-            'trunk', [('README', 'Original contents')]))
+            'trunk', [('README', 'Original contents')]),
+            opener_policy=AcceptAnythingPolicy())
         worker.run()
         branch = self.getStoredBazaarBranch(worker)
         self.assertEqual(
@@ -812,7 +824,8 @@
     def test_sync(self):
         # Do an import.
         worker = self.makeImportWorker(self.makeSourceDetails(
-            'trunk', [('README', 'Original contents')]))
+            'trunk', [('README', 'Original contents')]),
+            opener_policy=AcceptAnythingPolicy())
         worker.run()
         branch = self.getStoredBazaarBranch(worker)
         self.assertEqual(
@@ -841,7 +854,8 @@
             config.root, 'scripts', 'code-import-worker.py')
         output = tempfile.TemporaryFile()
         retcode = subprocess.call(
-            [script_path] + source_details.asArguments(),
+            [script_path, '--access-policy=anything'] +
+            source_details.asArguments(),
             stderr=output, stdout=output)
         self.assertEqual(retcode, 0)
 
@@ -879,11 +893,13 @@
             config.root, 'scripts', 'code-import-worker.py')
         output = tempfile.TemporaryFile()
         retcode = subprocess.call(
-            [script_path] + source_details.asArguments(),
+            [script_path, '--access-policy=anything'] +
+            source_details.asArguments(),
             stderr=output, stdout=output)
         self.assertEqual(retcode, CodeImportWorkerExitCode.SUCCESS)
         retcode = subprocess.call(
-            [script_path] + source_details.asArguments(),
+            [script_path, '--access-policy=anything'] +
+            source_details.asArguments(),
             stderr=output, stdout=output)
         self.assertEqual(retcode, CodeImportWorkerExitCode.SUCCESS_NOCHANGE)
 
@@ -898,11 +914,11 @@
         """
         TestActualImportMixin.setUpImport(self)
 
-    def makeImportWorker(self, source_details):
+    def makeImportWorker(self, source_details, opener_policy):
         """Make a new `ImportWorker`."""
         return CSCVSImportWorker(
             source_details, self.get_transport('foreign_store'),
-            self.bazaar_store, logging.getLogger())
+            self.bazaar_store, logging.getLogger(), opener_policy)
 
 
 class TestCVSImport(WorkerTest, CSCVSActualImportMixin):
@@ -945,13 +961,18 @@
 
     def makeForeignCommit(self, source_details):
         """Change the foreign tree."""
-        client = pysvn.Client()
-        client.checkout(source_details.url, 'working_tree')
+        auth = subvertpy.ra.Auth([subvertpy.ra.get_username_provider()])
+        auth.set_parameter(subvertpy.AUTH_PARAM_DEFAULT_USERNAME, "lptest2")
+        client = subvertpy.client.Client(auth=auth)
+        client.checkout(source_details.url, 'working_tree', "HEAD")
         file = open('working_tree/newfile', 'w')
         file.write('No real content\n')
         file.close()
         client.add('working_tree/newfile')
-        client.checkin('working_tree', 'Add a file', recurse=True)
+        client.log_msg_func = lambda c: 'Add a file'
+        (revnum, date, author) = client.commit(['working_tree'], recurse=True)
+        # CSCVS breaks on commits without an author, so make sure there is one.
+        self.assertIsNot(None, author)
         self.foreign_commit_count += 1
         shutil.rmtree('working_tree')
 
@@ -1001,23 +1022,35 @@
         if self.rcstype in ('git', 'bzr-svn', 'hg'):
             args['url'] = reference_url
         else:
-            raise AssertionError("unexpected rcs_type %r" % self.rcs_type)
+            raise AssertionError("unexpected rcs_type %r" % self.rcstype)
         source_details = self.factory.makeCodeImportSourceDetails(**args)
-        worker = self.makeImportWorker(source_details)
+        worker = self.makeImportWorker(source_details,
+            opener_policy=AcceptAnythingPolicy())
         self.assertEqual(
             CodeImportWorkerExitCode.FAILURE_INVALID, worker.run())
 
     def test_invalid(self):
         # If there is no branch in the target URL, exit with FAILURE_INVALID
         worker = self.makeImportWorker(self.factory.makeCodeImportSourceDetails(
-            rcstype=self.rcstype, url="file:///path/non/existant"))
+            rcstype=self.rcstype, url="http://localhost/path/non/existant";),
+            opener_policy=AcceptAnythingPolicy())
         self.assertEqual(
             CodeImportWorkerExitCode.FAILURE_INVALID, worker.run())
 
+    def test_forbidden(self):
+        # If the branch specified is using an invalid scheme, exit with
+        # FAILURE_FORBIDDEN
+        worker = self.makeImportWorker(self.factory.makeCodeImportSourceDetails(
+            rcstype=self.rcstype, url="file:///local/path"),
+            opener_policy=CodeImportBranchOpenPolicy())
+        self.assertEqual(
+            CodeImportWorkerExitCode.FAILURE_FORBIDDEN, worker.run())
+
     def test_unsupported_feature(self):
         # If there is no branch in the target URL, exit with FAILURE_INVALID
         worker = self.makeImportWorker(self.makeSourceDetails(
-            'trunk', [('bzr\\doesnt\\support\\this', 'Original contents')]))
+            'trunk', [('bzr\\doesnt\\support\\this', 'Original contents')]),
+            opener_policy=AcceptAnythingPolicy())
         self.assertEqual(
             CodeImportWorkerExitCode.FAILURE_UNSUPPORTED_FEATURE, worker.run())
 
@@ -1025,7 +1058,8 @@
         # Only config.codeimport.revisions_import_limit will be imported in a
         # given run.
         worker = self.makeImportWorker(self.makeSourceDetails(
-            'trunk', [('README', 'Original contents')]))
+            'trunk', [('README', 'Original contents')]),
+            opener_policy=AcceptAnythingPolicy())
         self.makeForeignCommit(worker.source_details)
         self.assertTrue(self.foreign_commit_count > 1)
         self.pushConfig(
@@ -1062,11 +1096,12 @@
         mapdbs().clear()
         WorkerTest.tearDown(self)
 
-    def makeImportWorker(self, source_details):
+    def makeImportWorker(self, source_details, opener_policy):
         """Make a new `ImportWorker`."""
         return GitImportWorker(
             source_details, self.get_transport('import_data'),
-            self.bazaar_store, logging.getLogger())
+            self.bazaar_store, logging.getLogger(),
+            opener_policy=opener_policy)
 
     def makeForeignCommit(self, source_details):
         """Change the foreign tree, generating exactly one commit."""
@@ -1112,11 +1147,12 @@
         mapdbs().clear()
         WorkerTest.tearDown(self)
 
-    def makeImportWorker(self, source_details):
+    def makeImportWorker(self, source_details, opener_policy):
         """Make a new `ImportWorker`."""
         return HgImportWorker(
             source_details, self.get_transport('import_data'),
-            self.bazaar_store, logging.getLogger())
+            self.bazaar_store, logging.getLogger(),
+            opener_policy=opener_policy)
 
     def makeForeignCommit(self, source_details):
         """Change the foreign tree, generating exactly one commit."""
@@ -1151,8 +1187,38 @@
         load_optional_plugin('svn')
         self.setUpImport()
 
-    def makeImportWorker(self, source_details):
+    def makeImportWorker(self, source_details, opener_policy):
         """Make a new `ImportWorker`."""
         return BzrSvnImportWorker(
             source_details, self.get_transport('import_data'),
-            self.bazaar_store, logging.getLogger())
+            self.bazaar_store, logging.getLogger(),
+            opener_policy=opener_policy)
+
+
+class CodeImportBranchOpenPolicyTests(TestCase):
+
+    def setUp(self):
+        super(CodeImportBranchOpenPolicyTests, self).setUp()
+        self.policy = CodeImportBranchOpenPolicy()
+
+    def test_follows_references(self):
+        self.assertEquals(True, self.policy.shouldFollowReferences())
+
+    def assertBadUrl(self, url):
+        self.assertRaises(BadUrl, self.policy.checkOneURL, url)
+
+    def assertGoodUrl(self, url):
+        self.policy.checkOneURL(url)
+
+    def test_checkOneURL(self):
+        self.assertBadUrl("sftp://somehost/";)
+        self.assertBadUrl("/etc/passwd")
+        self.assertBadUrl("file:///etc/passwd")
+        self.assertBadUrl("bzr+ssh://devpad/")
+        self.assertBadUrl("bzr+ssh://devpad/")
+        self.assertBadUrl("unknown-scheme://devpad/")
+        self.assertGoodUrl("http://svn.example/branches/trunk";)
+        self.assertGoodUrl("http://user:password@svn.example/branches/trunk";)
+        self.assertBadUrl("svn+ssh://svn.example.com/bla")
+        self.assertGoodUrl("git://git.example.com/repo")
+        self.assertGoodUrl("https://hg.example.com/hg/repo/branch";)

=== modified file 'lib/lp/codehosting/codeimport/tests/test_workermonitor.py'
--- lib/lp/codehosting/codeimport/tests/test_workermonitor.py	2011-08-22 11:56:26 +0000
+++ lib/lp/codehosting/codeimport/tests/test_workermonitor.py	2011-08-24 20:46:19 +0000
@@ -222,12 +222,14 @@
     def makeWorkerMonitorWithJob(self, job_id=1, job_data=()):
         return self.WorkerMonitor(
             job_id, BufferLogger(),
-            FakeCodeImportScheduleEndpointProxy({job_id: job_data}))
+            FakeCodeImportScheduleEndpointProxy({job_id: job_data}),
+            "anything")
 
     def makeWorkerMonitorWithoutJob(self, exception=None):
         return self.WorkerMonitor(
             1, BufferLogger(),
-            FakeCodeImportScheduleEndpointProxy({}, exception))
+            FakeCodeImportScheduleEndpointProxy({}, exception),
+            None)
 
     def test_getWorkerArguments(self):
         # getWorkerArguments returns a deferred that fires with the
@@ -732,8 +734,9 @@
 
         This implementation does it in-process.
         """
+        logger = BufferLogger()
         monitor = CIWorkerMonitorForTesting(
-            job_id, BufferLogger(),
+            job_id, logger,
             xmlrpc.Proxy(config.codeimportdispatcher.codeimportscheduler_url))
         deferred = monitor.run()
         def save_protocol_object(result):
@@ -828,6 +831,7 @@
         interpreter = '%s/bin/py' % config.root
         reactor.spawnProcess(
             DeferredOnExit(process_end_deferred), interpreter,
-            [interpreter, script_path, str(job_id), '-q'],
+            [interpreter, script_path, '--access-policy=anything', str(job_id),
+                '-q'],
             childFDs={0:0, 1:1, 2:2}, env=os.environ)
         return process_end_deferred

=== modified file 'lib/lp/codehosting/codeimport/worker.py'
--- lib/lp/codehosting/codeimport/worker.py	2011-08-02 11:28:46 +0000
+++ lib/lp/codehosting/codeimport/worker.py	2011-08-24 20:46:19 +0000
@@ -8,6 +8,7 @@
     'BazaarBranchStore',
     'BzrSvnImportWorker',
     'CSCVSImportWorker',
+    'CodeImportBranchOpenPolicy',
     'CodeImportSourceDetails',
     'CodeImportWorkerExitCode',
     'ForeignTreeStore',
@@ -49,11 +50,23 @@
 import SCM
 
 from canonical.config import config
+
+from lazr.uri import (
+    InvalidURIError,
+    URI,
+    )
+
 from lp.code.enums import RevisionControlSystems
+from lp.code.interfaces.branch import get_blacklisted_hostnames
 from lp.codehosting.codeimport.foreigntree import (
     CVSWorkingTree,
     SubversionWorkingTree,
     )
+from lp.codehosting.safe_open import (
+    BadUrl,
+    BranchOpenPolicy,
+    SafeBranchOpener,
+    )
 from lp.codehosting.codeimport.tarball import (
     create_tarball,
     extract_tarball,
@@ -62,6 +75,53 @@
 from lp.services.propertycache import cachedproperty
 
 
+class CodeImportBranchOpenPolicy(BranchOpenPolicy):
+    """Branch open policy for code imports.
+
+    In summary:
+     - follow references,
+     - only open non-Launchpad URLs
+     - only open the allowed schemes
+    """
+
+    allowed_schemes = ['http', 'https', 'svn', 'git', 'ftp']
+
+    def shouldFollowReferences(self):
+        """See `BranchOpenPolicy.shouldFollowReferences`.
+
+        We traverse branch references for MIRRORED branches because they
+        provide a useful redirection mechanism and we want to be consistent
+        with the bzr command line.
+        """
+        return True
+
+    def transformFallbackLocation(self, branch, url):
+        """See `BranchOpenPolicy.transformFallbackLocation`.
+
+        For mirrored branches, we stack on whatever the remote branch claims
+        to stack on, but this URL still needs to be checked.
+        """
+        return urljoin(branch.base, url), True
+
+    def checkOneURL(self, url):
+        """See `BranchOpenPolicy.checkOneURL`.
+
+        We refuse to mirror from Launchpad or a ssh-like or file URL.
+        """
+        try:
+            uri = URI(url)
+        except InvalidURIError:
+            raise BadUrl(url)
+        launchpad_domain = config.vhost.mainsite.hostname
+        if uri.underDomain(launchpad_domain):
+            raise BadUrl(url)
+        for hostname in get_blacklisted_hostnames():
+            if uri.underDomain(hostname):
+                raise BadUrl(url)
+        if uri.scheme not in self.allowed_schemes:
+            raise BadUrl(url)
+
+
 class CodeImportWorkerExitCode:
     """Exit codes used by the code import worker script."""
 
@@ -71,6 +131,7 @@
     SUCCESS_PARTIAL = 3
     FAILURE_INVALID = 4
     FAILURE_UNSUPPORTED_FEATURE = 5
+    FAILURE_FORBIDDEN = 6
 
 
 class BazaarBranchStore:
@@ -187,9 +248,9 @@
 
     :ivar branch_id: The id of the branch associated to this code import, used
         for locating the existing import and the foreign tree.
-    :ivar rcstype: 'svn' or 'cvs' as appropriate.
+    :ivar rcstype: 'svn', 'cvs', 'hg', 'git', 'bzr-svn', as appropriate.
     :ivar url: The branch URL if rcstype in ['svn', 'bzr-svn',
-        'git'], None otherwise.
+        'git', 'hg'], None otherwise.
     :ivar cvs_root: The $CVSROOT if rcstype == 'cvs', None otherwise.
     :ivar cvs_module: The CVS module if rcstype == 'cvs', None otherwise.
     """
@@ -415,7 +476,7 @@
     required_format = BzrDirFormat.get_default_format()
 
     def __init__(self, source_details, import_data_transport,
-                 bazaar_branch_store, logger):
+                 bazaar_branch_store, logger, opener_policy):
         """Construct an `ImportWorker`.
 
         :param source_details: A `CodeImportSourceDetails` object.
@@ -423,12 +484,15 @@
             uses this to fetch and store the Bazaar branches that are created
             and updated during the import process.
         :param logger: A `Logger` to pass to cscvs.
+        :param opener_policy: Policy object that decides what branches can
+             be imported
         """
         self.source_details = source_details
         self.bazaar_branch_store = bazaar_branch_store
         self.import_data_store = ImportDataStore(
             import_data_transport, self.source_details)
         self._logger = logger
+        self._opener_policy = opener_policy
 
     def getBazaarBranch(self):
         """Return the Bazaar `Branch` that we are importing into."""
@@ -601,11 +665,17 @@
     def _doImport(self):
         self._logger.info("Starting job.")
         saved_factory = bzrlib.ui.ui_factory
+        opener = SafeBranchOpener(self._opener_policy)
         bzrlib.ui.ui_factory = LoggingUIFactory(logger=self._logger)
         try:
             self._logger.info(
                 "Getting exising bzr branch from central store.")
             bazaar_branch = self.getBazaarBranch()
+            try:
+                self._opener_policy.checkOneURL(self.source_details.url)
+            except BadUrl, e:
+                self._logger.info("Invalid URL: %s" % e)
+                return CodeImportWorkerExitCode.FAILURE_FORBIDDEN
             transport = get_transport(self.source_details.url)
             for prober_kls in self.probers:
                 prober = prober_kls()
@@ -617,7 +687,13 @@
             else:
                 self._logger.info("No branch found at remote location.")
                 return CodeImportWorkerExitCode.FAILURE_INVALID
-            remote_branch = format.open(transport).open_branch()
+            remote_dir = format.open(transport)
+            try:
+                remote_branch = opener.runWithTransformFallbackLocationHookInstalled(
+                    remote_dir.open_branch)
+            except BadUrl, e:
+                self._logger.info("Invalid URL: %s" % e)
+                return CodeImportWorkerExitCode.FAILURE_FORBIDDEN
             remote_branch_tip = remote_branch.last_revision()
             inter_branch = InterBranch.get(remote_branch, bazaar_branch)
             self._logger.info("Importing branch.")

=== modified file 'lib/lp/codehosting/codeimport/workermonitor.py'
--- lib/lp/codehosting/codeimport/workermonitor.py	2011-06-16 23:43:04 +0000
+++ lib/lp/codehosting/codeimport/workermonitor.py	2011-08-24 20:46:19 +0000
@@ -125,7 +125,7 @@
     path_to_script = os.path.join(
         config.root, 'scripts', 'code-import-worker.py')
 
-    def __init__(self, job_id, logger, codeimport_endpoint):
+    def __init__(self, job_id, logger, codeimport_endpoint, access_policy):
         """Construct an instance.
 
         :param job_id: The ID of the CodeImportJob we are to work on.
@@ -138,6 +138,7 @@
         self._log_file = tempfile.TemporaryFile()
         self._branch_url = None
         self._log_file_name = 'no-name-set.txt'
+        self._access_policy = access_policy
 
     def _logOopsFromFailure(self, failure):
         request = log_oops_from_failure(
@@ -223,7 +224,10 @@
         deferred = defer.Deferred()
         protocol = self._makeProcessProtocol(deferred)
         interpreter = '%s/bin/py' % config.root
-        command = [interpreter, self.path_to_script] + worker_arguments
+        args = [interpreter, self.path_to_script]
+        if self._access_policy is not None:
+            args.append("--access-policy=%s" % self._access_policy)
+        command = args + worker_arguments
         self._logger.info(
             "Launching worker child process %s.", command)
         reactor.spawnProcess(
@@ -257,6 +261,8 @@
                 CodeImportResultStatus.FAILURE_UNSUPPORTED_FEATURE,
             CodeImportWorkerExitCode.FAILURE_INVALID:
                 CodeImportResultStatus.FAILURE_INVALID,
+            CodeImportWorkerExitCode.FAILURE_FORBIDDEN:
+                CodeImportResultStatus.FAILURE_FORBIDDEN,
                 }
         if isinstance(reason, failure.Failure):
             if reason.check(error.ProcessTerminated):

=== modified file 'scripts/code-import-worker-monitor.py'
--- scripts/code-import-worker-monitor.py	2011-02-21 00:06:55 +0000
+++ scripts/code-import-worker-monitor.py	2011-08-24 20:46:19 +0000
@@ -38,6 +38,11 @@
         LaunchpadScript.__init__(self, name, dbuser, test_args)
         set_up_oops_reporting('codeimportworker', name, mangle_stdout=True)
 
+    def add_my_options(self):
+        """See `LaunchpadScript`."""
+        self.parser.add_option("--access-policy", type="choice", metavar="ACCESS_POLICY",
+            choices=["anything", "default"], default=None)
+
     def _init_db(self, isolation):
         # This script doesn't access the database.
         pass

=== modified file 'scripts/code-import-worker.py'
--- scripts/code-import-worker.py	2011-06-16 23:43:04 +0000
+++ scripts/code-import-worker.py	2011-08-24 20:46:19 +0000
@@ -26,17 +26,25 @@
 from canonical.config import config
 from lp.codehosting import load_optional_plugin
 from lp.codehosting.codeimport.worker import (
-    BzrSvnImportWorker, CSCVSImportWorker, CodeImportSourceDetails,
-    GitImportWorker, HgImportWorker, get_default_bazaar_branch_store)
+    BzrSvnImportWorker, CSCVSImportWorker, CodeImportBranchOpenPolicy,
+    CodeImportSourceDetails, GitImportWorker, HgImportWorker,
+    get_default_bazaar_branch_store)
+from lp.codehosting.safe_open import AcceptAnythingPolicy
 from canonical.launchpad import scripts
 
 
+opener_policies = {
+    "anything": AcceptAnythingPolicy(),
+    "default": CodeImportBranchOpenPolicy()
+    }
+
+
 def force_bzr_to_use_urllib():
     """Prevent bzr from using pycurl to connect to http: urls.
 
     We want this because pycurl rejects self signed certificates, which
     prevents a significant number of import branchs from updating.  Also see
-    https://bugs.edge.launchpad.net/bzr/+bug/516222.
+    https://bugs.launchpad.net/bzr/+bug/516222.
     """
     from bzrlib.transport import register_lazy_transport
     register_lazy_transport('http://', 'bzrlib.transport.http._urllib',
@@ -50,8 +58,10 @@
     def __init__(self):
         parser = OptionParser()
         scripts.logger_options(parser)
-        options, self.args = parser.parse_args()
-        self.logger = scripts.logger(options, 'code-import-worker')
+        parser.add_option("--access-policy", type="choice", metavar="ACCESS_POLICY",
+            choices=["anything", "default"], default="default")
+        self.options, self.args = parser.parse_args()
+        self.logger = scripts.logger(self.options, 'code-import-worker')
 
     def main(self):
         force_bzr_to_use_urllib()
@@ -73,7 +83,8 @@
         import_worker = import_worker_cls(
             source_details,
             get_transport(config.codeimport.foreign_tree_store),
-            get_default_bazaar_branch_store(), self.logger)
+            get_default_bazaar_branch_store(), self.logger,
+            opener_policies[self.options.access_policy])
         return import_worker.run()