launchpad-reviewers team mailing list archive
-
launchpad-reviewers team
-
Mailing list archive
-
Message #25380
[Merge] ~ilasc/turnip:stats-collection-point into turnip:master
Ioana Lasc has proposed merging ~ilasc/turnip:stats-collection-point into turnip:master.
Commit message:
Add metrics collection points to GitProcessProtocol
Requested reviews:
Launchpad code reviewers (launchpad-reviewers)
For more details, see:
https://code.launchpad.net/~ilasc/turnip/+git/turnip/+merge/391365
This MP uses print statements to show points in the code where we could potentially collect stats to be sent to statsd. The purpose of the MP is to give us a better starting point in discussing how and where is best to collect stats: in the PackBackendProtocol over pipes, inside the GitProcessProtocol itself, in a separate thread or otherwise.
So far I've tried several approaches:
1. wrap "default_reactor.spawnProcess(self.peer, cmd, args, env=env)" into a subprocess.call
2. write to GitProcessProtocol from the PackBackendProtocol and back
3. add additional fd to "default_reactor.spawnProcess(self.peer, cmd, args, env=env)" in the form "childFDs = { 0: "w", 1: "r", 2: "r" , 4: "w"}" and attempted to call writeToChild(4, _data_) but that didn't work as PackBackendProtocol is not a twisted.internet.protocol.ProcessProtocol
4. from the failure of option 3 I thought if I could introduce another layer of twisted.internet.protocol.ProcessProtocol between PBP and GitProcessProtocol I might be able to use additional pipes to preserve Git's normal output as well as communicate metrics.
5. spawn a "self written" separate thread to monitor the GitProcessProtocol process from outside
Essentially all 4 major approaches failed by breaking existing PBP <-> Git pipes or by not being able to pass metrics data through, I didn't pursue the 5th as I understand we do not want to go down the multithreading route.
The current proposal uses the GitProcessProtocol itself as a stats collection point.Running the "test_a_single_clone_and_push" unit test will produce:
turnip.pack.tests.test_functional.TestBackendFunctional.test_a_single_clone_and_push (v0 protocol) ...
PID 4545 rss with psutil: 0 MB
PID 4545 maxrss with getrusage: 88 MB
PID 4545 cpu utilization as a percentage 0.0
PID 4545 hostname: turnip-bionic
PID 4545 snapshot CPU times: pcputimes(user=0.0, system=0.0, children_user=0.0, children_system=0.0, iowait=0.0)
PID 4545 execution time: 0.856162071228
PID 4545 command: git-upload-pack
PID 4545 repository: /test
PID 4551 rss with psutil: 72 MB
PID 4551 maxrss with getrusage: 88 MB
PID 4551 cpu utilization as a percentage 0.0
PID 4551 hostname: turnip-bionic
PID 4551 snapshot CPU times: pcputimes(user=0.0, system=0.0, children_user=0.0, children_system=0.0, iowait=0.0)
PID 4551 execution time: 0.970222949982
PID 4551 command: git-receive-pack
PID 4551 repository: /test
ok
--
Your team Launchpad code reviewers is requested to review the proposed merge of ~ilasc/turnip:stats-collection-point into turnip:master.
diff --git a/turnip/pack/git.py b/turnip/pack/git.py
index 4407085..f0d9244 100644
--- a/turnip/pack/git.py
+++ b/turnip/pack/git.py
@@ -12,8 +12,11 @@ __metaclass__ = type
import uuid
-
+import psutil
+import resource
import six
+import socket
+import time
from twisted.internet import (
defer,
error,
@@ -242,14 +245,34 @@ class GitProcessProtocol(protocol.ProcessProtocol):
def __init__(self, peer):
self.peer = peer
self.out_started = False
+ self.create_time = 0
+ self.pid = 0
+ self.os_process = None
+ self.monitoring_thread = None
+
+ # increase number of processes counter here
+ # and spwan send to statsd
+ #self.statsd_client.incr('turnip.hostname.total.processes', rate=1)
def connectionMade(self):
+ self.pid = self.transport.pid
+ self.os_process = psutil.Process(pid = self.pid)
+ self.create_time = self.os_process.create_time()
+
self.peer.setPeer(self)
self.peer.transport.registerProducer(self, True)
self.transport.registerProducer(
UnstoppableProducerWrapper(self.peer.transport), True)
self.peer.resumeProducing()
+ print('\n')
+ print('PID ',self.pid, ' rss with psutil: ', self.os_process.memory_info().rss / 1024**2, ' MB')
+ print('PID ',self.pid, ' maxrss with getrusage: ', resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024, ' MB')
+ print('PID ',self.pid, ' cpu utilization as a percentage ', self.os_process.cpu_percent() / psutil.cpu_count())
+ print('PID ',self.pid, ' hostname: ', socket.gethostname())
+ print('PID ',self.pid, ' snapshot CPU times: ', self.os_process.cpu_times())
+
+
def outReceived(self, data):
self.out_started = True
self.peer.sendRawData(data)
@@ -292,6 +315,22 @@ class GitProcessProtocol(protocol.ProcessProtocol):
self.transport.closeChildFD(0)
def processEnded(self, reason):
+ # this instance of the GitProcessProtocol just ended
+ # we record it duration for statsd metrics here
+ process_duration = time.time() - self.create_time
+ if (self.peer.command == b'git-upload-pack' or
+ self.peer.command == b'git-receive-pack'):
+ print('PID ',self.pid, ' execution time:',
+ (time.time() - self.create_time))
+ print('PID ',self.pid, ' command:', self.peer.command)
+ print('PID ',self.pid, ' repository: ',self.peer.raw_pathname)
+
+ #self.statsd_client.timing('turnip.hostname.operation.timer', process_duration)
+
+ # decrease number of processes counter here
+ # and spwan send to statsd
+ #self.statsd_client.decr('turnip.hostname.total.processes')
+
if reason.check(error.ProcessDone) and reason.value.exitCode != 0:
code = reason.value.exitCode
self.peer.log.info(
diff --git a/turnip/pack/tests/test_functional.py b/turnip/pack/tests/test_functional.py
index c06c4e2..34ea4f4 100644
--- a/turnip/pack/tests/test_functional.py
+++ b/turnip/pack/tests/test_functional.py
@@ -179,6 +179,28 @@ class FunctionalTestMixin(WithScenarios):
defer.returnValue((out, err))
@defer.inlineCallbacks
+ def test_a_single_clone_and_push(self):
+ # Test a full clone, commit, push, clone, commit, push, pull
+ # cycle using the backend server.
+ test_root = self.useFixture(TempDir()).path
+ clone1 = os.path.join(test_root, 'clone1')
+
+ # Clone the empty repo from the backend and commit to it.
+ yield self.assertCommandSuccess((b'git', b'clone', self.url, clone1))
+ yield self.assertCommandSuccess(
+ (b'git', b'config', b'user.name', b'Test User'), path=clone1)
+ yield self.assertCommandSuccess(
+ (b'git', b'config', b'user.email', b'test@xxxxxxxxxxx'),
+ path=clone1)
+ yield self.assertCommandSuccess(
+ (b'git', b'commit', b'--allow-empty', b'-m', b'Committed test'),
+ path=clone1)
+
+ # Push it back up to the backend.
+ yield self.assertCommandSuccess(
+ (b'git', b'push', b'origin', b'master'), path=clone1)
+
+ @defer.inlineCallbacks
def test_clone_and_push(self):
# Test a full clone, commit, push, clone, commit, push, pull
# cycle using the backend server.